Description

RELATED APPLICATIONS 

[0001] This application is a continuation-in-part ("CIP") of application Serial Nos. 08/529,055, filed September 15,
1995, 08/226,844, filed May 29, 1992, 08/093,907, filed May 29, 1992, 07/884,918, filed July 5, 1994 (corresponding
to PCT/US93/05191); of application Serial No. 08/482,981, filed June 7, 1995; of application Serial No. 08/458,399,
filed June 2, 1995; of application Serial No. 08/446,201, filed May 19, 1995 (as a CIP of USSN 08/246,636); of
application Serial No. 08/246,636, filed May 20, 1994 (as a CIP of USSN 08/048,896, filed April 20, 1993 as a CIP of USSN
07/835,698, filed February 12, 1992 as a CIP of USSN 07/656,773); of application Serial No. 08/319,795, filed October 7,
1994 (as a CIP of USSN 08/246,636); of application Serial No. 08/072,070, filed June 3, 1993; of application Serial No.
07/656,773, filed February 15, 1991 (USSN 656,773 and 835,696 corresponding to Int'l application WO 92/1448); and,
each of these applications, as well as each application, document or reference cited in these applications, is hereby
incorporated herein by reference. Documents or references are also cited in the following text, either in a Reference
List appended to certain Examples, or before the claims, or in the text itself; and, each of these documents or references
is hereby expressly incorporated herein by reference.

FIELD OF THE INVENTION 

[0002] This invention relates to pneumococcal genes, portions thereof, expression products therefrom and uses of
such genes, portions and products; especially to genes of Streptococcus pneumoniae, e.g., the gene encoding
pneumococcal surface protein A (PspA) (said gene being "pspA"), pspA-like genes, pneumococcal surface protein C (PspC)
(said gene being "pspC"), portions of such genes, expression products therefrom, and the uses of such genes, portions
thereof and expression products therefrom. Such uses include uses of the genes and portions thereof for obtaining
expression products by recombinant techniques, as well as for detecting the presence of Streptococcus pneumoniae
or strains thereof by detecting DNA thereof by hybridization or amplification (e.g., PCR) and hybridization techniques
(e.g., obtaining DNA-containing sample, contacting same with genes or fragment under PCR, amplification and/or
hybridization conditions, and detecting presence of or isolating hybrid or amplified product). The expression product
uses include use in preparing antigenic, immunological or vaccine compositions, for eliciting antibodies, an
immunological response (other than or additional to antibodies) or a protective response (including antibody or other
immunological response by administering composition to a suitable host); or, the expression product can be for use in detecting
the presence of Streptococcus pneumoniae by detecting antibodies to Streptococcus pneumoniae protein(s) or
antibodies to a portion thereof in a host, e.g., by obtaining an antibody-containing sample from a relevant host, contacting
the sample with expression product and detecting binding (for instance by having the product labeled); and, the
antibodies generated by the aforementioned compositions are useful in diagnostic or detection kits or assays. Thus, the
invention relates to varied compositions of matter and methods for use thereof.

BACKGROUND OF THE INVENTION 

[0003] Streptococcus pneumoniae is an important cause of otitis media, meningitis, bacteremia and pneumonia. 
Despite the use of antibiotics and vaccines, the prevalence of pneumococcal infections has declined little over the last 
twenty-five years. 

[0004] It is generally accepted that immunity to Streptococcus pneumoniae can be mediated by specific antibodies 
against the polysaccharide capsule of the pneumococcus. However, neonates and young children fail to make an 
immune response against polysaccharide antigens and can have repeated infections involving the same capsular
serotype.

[0005] One approach to immunizing infants against a number of encapsulated bacteria is to conjugate the capsular 
polysaccharide antigens to protein to make them immunogenic. This approach has been successful, for example, with 
Haemophilus influenzae b (see U.S. Patent no. 4,496,538 to Gordon and U.S. Patent no. 4,673,574 to Anderson). 
However, there are over eighty known capsular serotypes of S. pneumoniae of which twenty-three account for most
of the disease. For a pneumococcal polysaccharide-protein conjugate to be successful, the capsular types responsible
for most pneumococcal infections would have to be made adequately immunogenic. This approach may be difficult, 
because the twenty-three polysaccharides included in the presently-available vaccine are not all adequately immuno- 
genic, even in adults. 

[0006] An alternative approach for protecting children, and also the elderly, from pneumococcal infection would be
to identify protein antigens that could elicit protective immune responses. Such proteins may serve as a vaccine by
themselves, may be used in conjunction with successful polysaccharide-protein conjugates, or as carriers for polysac- 
charides. 
[0007] McDaniel et al. (I), J. Exp. Med. 130:386-397, 1984, relates to the production of hybridoma antibodies that
recognize cell surface polypeptide(s) on S. pneumoniae and protection of mice from infection with certain strains of
encapsulated pneumococci by such antibodies. This surface protein antigen has been termed "pneumococcal surface
protein A" or PspA for short.

[0008] McDaniel et al. (II), Microbial Pathogenesis 1:519-531, 1986, relates to studies on the characterization of the
PspA. Considerable diversity in the PspA molecule in different strains was found, as were differences in the epitopes 
recognized by different antibodies. 

[0009] McDaniel et al. (III), J. Exp. Med. 165:381-394, 1987, reports that immunization of X-linked immunodeficient
(XID) mice with non-encapsulated pneumococci expressing PspA, but not isogenic pneumococci lacking PspA, protects
mice from subsequent fatal infection with pneumococci.

[0010] McDaniel et al. (IV), Infect. Immun., 59:222-226, 1991, relates to immunization of mice with a recombinant
full length fragment of PspA that is able to elicit protection against pneumococcal strains of capsular types 6A and 3.
[0011] Crain et al., Infect. Immun., 56:3293-3299, 1990, relates to a rabbit antiserum that detects PspA in 100% (n =
95) of clinical and laboratory isolates of strains of S. pneumoniae. When reacted with seven monoclonal antibodies to
PspA, fifty-seven S. pneumoniae isolates exhibited thirty-one different patterns of reactivity.

[0012] The PspA protein type is independent of capsular type. It would seem that genetic mutation or exchange in
the environment has allowed for the development of a large pool of strains which are highly diverse with respect to
capsule, PspA, and possibly other molecules with variable structures. Variability of PspA's from different strains also is
evident in their molecular weights, which range from 67 to 99 kD. The observed differences are stably inherited and
are not the result of protein degradation.

[0013] Immunization with a partially purified PspA from a recombinant λgt11 clone elicited protection against
challenge with several S. pneumoniae strains representing different capsular and PspA types, as described in McDaniel
et al. (IV), Infect. Immun. 59:222-228, 1991. Although clones expressing PspA were constructed according to that
paper, the product was insoluble and isolation from cell fragments following lysis was not possible.

[0014] While the protein is variable in structure between different pneumococcal strains, numerous cross-reactions
exist between all PspA's, suggesting that sufficient common epitopes may be present to allow a single PspA or at least
a small number of PspA's to elicit protection against a large number of S. pneumoniae strains. 
[0015] In addition to the published literature specifically referred to above, the inventors, in conjunction with co- 
workers, have published further details concerning PspA's, as follows: 

1. Abstracts of 89th Annual Meeting of the American Society for Microbiology, p. 125, item D-257, May 1989;

2. Abstracts of 90th Annual Meeting of the American Society for Microbiology, p. 98, item D-106, May 1990;

3. Abstracts of 3rd International ASM Conference on Streptococcal Genetics, p. 11, item 12, June 1990;

4. Talkington et al., Infect. Immun. 59:1285-1289, 1991;

5. Yother et al. (I), J. Bacteriol. 174:601-609, 1992;

6. Yother et al. (II), J. Bacteriol. 174:610-618, 1992; and

7. McDaniel et al. (V), Microbiol. Pathogenesis, 13:261-268.

[0016] It would be useful to provide PspA or fragments thereof in compositions, including PspA's or fragments from
varying strains in such compositions, to provide antigenic, immunological or vaccine compositions; and, it is even
further useful to show that the various strains can be grouped or typed, thereby providing a basis for cross-reactivities
of PspA's or fragments thereof, and thus providing a means for determining which strains to represent in such
compositions (as well as how to test for, detect or diagnose one strain from another).

[0017] Further, it would be advantageous to provide a pspA-like gene or a pspC gene in certain strains, as well as
primers (oligonucleotides) for identification of such a gene, as well as of conserved regions in that gene and in pspA;
for instance, for detecting, determining, isolating, or diagnosing strains of S. pneumoniae. These uses and advantages,
it is believed, have not heretofore been provided in the art.

OBJECTS AND SUMMARY OF THE INVENTION

[0018] The invention provides an isolated amino acid molecule comprising residues 1 to 115, 1 to 260, 192 to 588, 
192 to 299, or residues 192 to 260 of pneumococcal surface protein A of Streptococcus pneumoniae. 
[0019] The invention further provides an isolated DNA molecule comprising a fragment of a pneumococcal surface
protein A gene of Streptococcus pneumoniae encoding the isolated amino acid molecule. 
[0020] The invention also provides PCR primers or hybridization probes comprising the isolated DNA molecule. 
[0021] The invention additionally provides an antigenic, vaccine or immunological composition comprising the amino
acid molecule. 

[0022] The invention includes an isolated DNA molecule comprising nucleotides 1 to 26, 1967 to 1990, 161 to 187,
1093 to 1117, or 1312 to 1331 or 1333 to 1355 of a pneumococcal surface protein A gene of Streptococcus pneumoniae.
The DNA molecule can be used as a PCR primer or hybridization probe; and therefore the invention comprehends a 
PCR primer or hybridization probe comprising the isolated DNA molecule. 
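
By way of illustration only, the nucleotide ranges recited above can be read as 1-based, inclusive coordinates on the pspA sequence. The following minimal sketch, which assumes that reading, computes the length of each recited primer or probe and shows how such a range would be sliced from a sequence string. The names PRIMER_RANGES, span_length and extract are hypothetical and are not taken from this disclosure, and the short test sequence is a placeholder rather than pspA.

```python
# 1-based, inclusive nucleotide ranges as recited in the paragraph above.
PRIMER_RANGES = [(1, 26), (1967, 1990), (161, 187),
                 (1093, 1117), (1312, 1331), (1333, 1355)]


def span_length(start, end):
    """Number of nucleotides in a 1-based, inclusive range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract(seq, start, end):
    """Return the sub-sequence covering 1-based positions start..end."""
    span_length(start, end)  # validates the coordinates
    if end > len(seq):
        raise ValueError("range extends past the end of the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


PRIMER_LENGTHS = [span_length(s, e) for s, e in PRIMER_RANGES]
```

Under this reading the recited ranges correspond to oligonucleotides of 20 to 27 nucleotides.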

[0023] The invention also includes an isolated DNA molecule comprising a fragment having homology with a portion 
of a pneumococcal surface protein A gene of Streptococcus pneumoniae. The DNA preferably is the following (which 
include the portion having homology and restriction sites, and selection of other restriction sites or sequences for such 
DNA is within the ambit of the skilled artisan from this disclosure): 

CCGGATCCAGCTCCTGCACCAAAAAC;
GOGCGTCGACGGCTTAAACCGATTCACGATTGG;
CCGGATCCTGAGCCAGAGCAGTTGGCTG;
CCGGATCC6CTCAAAGAGATTGATGAGTCT6;
GCGCATCCCGTAGCCAGTCAGTCTAAAGCTG;
CTGAGTCGACTOGAGTTTCTGGAGCTGGAGC;
CCGGATCCAGCfCCAGCTCGAGAAACTCCAG;
GCGGATCCTTGACCAATATTTACGGAGGAGGC;
GTTTTTGGTGCAGGAGCTGG;
GCTATGGGCTACAGGTTG;
CCACCTGTAGCCATAGC;
CCGCATCCAGCGTGCCTATCTTAGGGGCTGGTT; and
GCAAGCTTATGATATAGAAATTTGTAAC

(thus, the invention broadly comprehends DNA homologous to portions of pspA; preferably further including restriction 
sequences). 

[0024] These DNA molecules can be used as PCR primers or probes; and thus, the invention comprehends a primer
or probe comprising any of these molecules.

[0025] The invention further still provides PCR probe(s) which distinguish between pspA and pspA-like nucleotide
sequences, as well as PCR probe(s) which hybridize to both pspA and pspA-like nucleotide sequences.
[0026] Additionally, the invention includes a PspA extract prepared by a process comprising: growing pneumococci 
in a first medium containing choline chloride, eluting live pneumococci with a choline chloride containing salt solution, 
and growing the pneumococci in a second medium containing an alkanolamine and substantially no choline; as well 
as a PspA extract prepared by that process and further comprising purifying PspA by isolation on a choline-Sepharose 
affinity column. These processes are also included in the invention. 

[0027] An immunological composition comprising these extracts is comprehended by the invention, as well as an
immunological composition comprising the full length PspA.

[0028] A method for enhancing the immunogenicity of a PspA-containing immunological composition comprising, in
said composition, the C-terminal portion of PspA, is additionally comprehended, as well. 

[0029] The invention also comprehends an immunological composition comprising at least two PspAs. The latter immunological composition can have
the PspAs from different groups or families; the groups or families can be based on RFLP or sequence studies (see, 
e.g., Fig. 13). 

[0030] Further, the invention provides an isolated amino acid molecule comprising pneumococcal surface protein C, 
PspC, of Streptococcus pneumoniae having an alpha-helical, proline rich and repeat regions, an isolated DNA molecule 
comprising a pneumococcal surface protein C gene encoding the aforementioned PspC, and primers and hybridization
probes consisting essentially of the isolated DNA molecule. 

[0031] Still further, an isolated amino acid molecule comprising pneumococcal surface protein C, PspC, of Strepto- 
coccus pneumoniae is provided, having an alpha-helical, proline rich and repeat regions, having substantial homology 
with a protection eliciting region of PspA, and an isolated DNA molecule comprising a pneumococcal surface protein C
gene encoding the aforementioned PspC, and primers and hybridization probes consisting essentially of the isolated
DNA molecule are provided by the present invention. 

[0032] Additionally, the present invention provides immunological compositions comprising PspC. 
[0033] These and other embodiments are disclosed or are obvious from the following detailed description. 
BRIEF DESCRIPTION OF THE FIGURES
[0034] 

Figures 1A and 1B show: Evaluation of digested plasmid constructs. Fig. 1A: 1% agarose gel electrophoresis of
plasmids isolated from transformed E. coli BL21(DE3) strains stained with ethidium bromide. Lane 1: 1 kb DNA
ladder (sizes noted in kb), lane 2: pRCT125; lane 3: pRCT105, lane 4: DBL5 pspA insert, lane 5: pRCT113, lane 6:
BG9739 pspA insert, lane 7: pRCT117, and lane 8: L81905 pspA insert. Fig. 1B: Corresponding Southern blot of
gel in Fig. 1A probed with full-length Rx1 pspA and hybridization detected as described in Example 1. The arrow
indicates the 1.2 kb pspA digested inserts from plasmid constructs and the PCR-amplified pspA fragments from
the pneumococcal donor strains used in cloning.

Figure 2 shows: Evaluation of strain RCT105 cell fractions containing truncated DBL5 PspA. Proteins from E. coli
cell fractions were resolved by 10% SDS-PAGE, transferred to NC, and probed with MAb XiR278. Lane 1: molecular
weight markers (noted in kDa), lane 2: full-length, native DBL5 PspA, lane 3: uninduced cells, lanes 4-6: induced
cells; 1 hr, 2 hr, and 3 hr of IPTG induction respectively, lane 7: periplasmic proteins, lane 8: cytoplasmic proteins,
and lane 9: insoluble cell wall/membrane material.

Figure 3 shows: SDS-PAGE of R36A PspA (80 ng) column isolated from CDM-ET and an equal volume of an
equivalent WG44.1 prep. Identical gels are shown stained with Bio-Rad silver kit (A) or immunoblotted with PspA
MAb XiR278 (B). The PspA isolated from R36A shows the characteristic monomer (84 kDa) and dimer bands.
Figure 4 shows: Cell lysates of pneumococcal isolates MC27 and MC28 were subjected to SDS-PAGE and
transferred to nitrocellulose for Western blotting with seven MAb to PspA. 7D2 detected a protein of 82 kDa in each
isolate and XiR278 and 2A4 detected a protein of 190 kDa in each isolate. MAb Xi64, Xi126, 1A4 and SR4W4
were not reactive. Strains MC25 and MC26 yielded identical results.

Figure 5 (Figs. 5A and 5B) shows: Southern blot of Hind III digest of MC25-MC28 chromosomal DNA developed
at a stringency greater than 95 percent. A digest of Rx1 DNA was used as a comparison. The blot was probed
with LSMpspA13/2, a full length Rx1 probe (Fig. 5A) and LSMpspA12/6, a 5' probe of Rx1 pspA (Fig. 5B). The same
concentration of Rx1 DNA was used in both panels, but the concentrations of MC25-MC28 DNA in Fig. 5B were
half that used in Fig. 5A to avoid detection of partial digests.

Figure 6 shows: RFLP of amplified pspA. PspA from MC25 was amplified by PCR using 5' and 3' primers for pspA
(LSM13 and LSM2, respectively). The amplified DNA was digested with individual restriction endonucleases prior
to electrophoresis and staining with ethidium bromide. Lane 1 BclI, Lane 2 BamHI, Lane 3 BstNI, Lane 4 PstI,
Lane 5 SacI, Lane 6 EcoRI, Lane 7 SmaI, Lane 8 KpnI.

Figure 7 shows: A depiction of PspA showing the relative location and orientation of the oligonucleotides. 
Figure 8 shows: Derivatives of the S. pneumoniae D39-Rx1 family.
Figures 9 to 10 show: Electrophoresis of pspA or amplified pspA product with HhaI (Fig. 9), Sau3AI (Fig. 10).

Figure 11 shows: RFLP pattern of two isolates from six families. 

Figure 12 shows: RFLP pattern of two isolates from six families (using products from amplification with SKH2 and 
LSM 13). 

Figure 13 shows: Sequence primarily in the N-terminal half of PspA. 
Figure 14 shows: Cell lysates of pneumococcal isolates MC27 and MC28, subjected to SDS-PAGE and Western
blotting with seven MAbs to PspA; 7D2 detected a protein of 82 kDa in each isolate, and XiR278 and 2A4 detected
a protein of 190 kDa in each isolate; MAbs Xi64, Xi126, 1A4 and SR4W4 were not reactive; strains MC25 and
MC26 yielded identical results (not shown).

Figures 15A and 15B show: a Southern blot of Hind III digest of MC25-28 chromosomal DNA, using a digest of Rx1
DNA as a comparison; the blot was probed with LSMpspA13/2, a full length Rx1 probe (A), and LSMpspA12/6, a
5' probe of Rx1 pspA (B); the same concentration of Rx1 DNA was used in both panels, but the concentrations of
MC25-28 DNA in B were half that used in A to avoid detection of partial digests.

Figures 15C and 15D show: the nucleotide sequences of primers LSM13, LSM2, LSM12 and LSM6, and that of
probes LSMpspA13/2 and LSMpspA12/6.

Figure 16 shows: RFLP of amplified pspA, wherein PspA from MC25 was amplified by PCR using 5' and 3' primers
for pspA (LSM13 and LSM2, respectively); the amplified DNA was digested with individual restriction
endonucleases prior to electrophoresis and staining with ethidium bromide; Bcl I was used in lane 1; BamH I was used in
lane 2; BstN I was used in lane 3; Pst I was used in lane 4; Sac I was used in lane 5; EcoR I was used in lane 6;
Sma I was used in lane 7; and Kpn I was used in lane 8.

Figure 17 shows: position and orientation of oligonucleotides relative to domains encoded by pspA; numbers along
the bottom of the Figure represent amino acids in the mature PspA polypeptide from strain Rx1, and arrows
represent the relative position (not to scale) and orientation of oligonucleotides.
Figure 18 shows: a restriction map of the pZero vector.
Figure 19 shows: the nucleotide sequences of SKH2, LSM13, N192 and C588.

Figure 20 shows: a comparison of the structural motifs of PspA and PspC; PspA has a smaller alpha-helical region,
and does not contain the direct repeats within the alpha-helix (indicated by the dashed lines); the alpha-helical
regions which are homologous between PspA and PspC are indicated by the striped pattern; and PCR primers
are indicated by the arrows.

Figure 21 shows: the amino acid and nucleotide sequence of PspC, wherein the putative -10 and -35 regions are
underlined, and the ribosomal binding site is in lower case. 

Figure 22 shows: the Bestfit analysis of PspA and PspC; percent identity is 69% and percent similarity is 77%;
amino acids of PspA are on the bottom line (1-588) and amino acids of PspC are on the top line (249-891), and
a dashed line indicates identity.

Figure 23 shows: the coiled coil motif of the alpha-helix of PspC; amino acids that are not in the coiled coil motif
are in the right column.

Figure 24 shows: a matrix plot comparison of the repeat regions of the alpha-helical region of PspC.
Figure 25 shows: the sequence of the alpha helical and proline regions of LXS532 (PspC.D39).
Figure 26 shows: a comparison of nucleotides of pspA.Rx1 to pspC.D39.
Figure 27 shows: a BESTFIT analysis of pspC.EF6797 and pspC.D39.
Figure 28 shows: the amino acid comparison of PspC of EF6797 and D39.
Figure 29 shows: the amino acid comparison of PspC.D39 and PspA.Rx1.

DETAILED DESCRIPTION 

[0035] Knowledge of and familiarity with the applications incorporated herein by reference is assumed; and, those
applications disclose the sequence of pspA as well as certain portions thereof, and PspA and compositions containing
PspA.

[0036] As discussed above and in the following Examples, the invention relates to truncated PspA, e.g., PspA
C-terminal to position 192 such as a.a. 192-588 ("BC100"), 192-299 and 192-260 of PspA eliciting cross-protection, as
well as to DNA encoding such truncated PspA (which amplify the coding for these amino acid regions homologous to
most PspAs).

[0037] The invention further relates to a pspA-like gene, or a pspC gene and portions thereof (e.g., probes, primers)
which can hybridize thereto and/or amplify that gene, as well as to DNA molecules which hybridize to pspA, so that
one can, by hybridization assay and/or amplification, ascertain the presence of a particular pneumococcal strain; and,
the invention provides that a PspC can be produced by the pspA-like or pspC sequence (which PspC can be used like
PspA).

[0038] Indeed, the invention further relates to oligonucleotide probes and/or primers which react with pspA and/or
pspC of many, if not all, strains, so as to permit identification, detection or diagnosis of any pneumococcal strain, as
well as to expression products of such probes and/or primers, which can provide cross-reactive epitopes of interest.
[0039] The repeat region of pspA and/or pspC is highly conserved such that the present invention provides
oligonucleotide probes or primers to this region reactive with most, if not all strains, thereby providing diagnostic assays and
a means for identifying epitopes of interest.

[0040] The invention demonstrates that the pspC gene is homologous to the pspA gene in the leader sequence, first
portion of the proline-rich region and in the repeat region; but, these genes differ in the second portion of their
proline-rich regions and at the very 3' end of the gene encoding the 17 amino acid tail of PspA. The product of the pspC gene
is expected to lack a C-terminal tail, suggesting different anchoring than PspA. Drug interference with functions such
as surface binding of the coding for repeat regions of pspA and the pspC genes, or with the repeat regions of the
expression products, is therefore a target for intervention of pneumococcal infection.

[0041] Further still, the invention provides evidence of additional pspA homologous sequences, in addition to pspA
and the pspC sequence. The invention, as mentioned above, includes oligonucleotide probes or primers which
distinguish between pspA and the pspC sequence, e.g., LSM1 and LSM2, useful for diagnostic, detecting, or isolating
purposes; and LSM1 and LSM10 or LSM1 and LSM7 which amplify a portion of the pspC gene, particularly the portion of
that gene which encodes an antigenic, immunological or protective protein.

[0042] The invention further relates to a method for the isolation of native PspA by growth of pneumococci in medium
containing high concentrations of (about 0.9% to about 1.4%, preferably 1.2%) choline chloride, elution of live
pneumococci with a salt solution containing choline chloride, e.g., about 1% to about 3%, preferably 2% choline chloride, and
growth of pneumococci in medium in which the choline in the medium has been almost or substantially completely
replaced with a lower alkanolamine, e.g., C1-C6, preferably
ethanolamine (e.g., 0.0000005% to 0.0000015%, preferably 0.000001% choline chloride plus 0.02% to
0.04% alkanolamine (ethanolamine), preferably 0.03%). PspA from such pneumococci is then preferably isolated from
a choline-Sepharose affinity column, thereby providing highly purified PspA. Such isolated and/or purified PspA is highly
immunogenic and is useful in antigenic, immunological or vaccine composition.

[0043] Indeed, the growth media of the pneumococci grown in the presence of the alkanolamine (rather than choline)
contains PspA and is itself highly immunogenic and therefore useful as an antigenic, immunological or vaccine
composition; and, is rather inexpensive to produce. Per microgram of PspA, the PspA in the alkanolamine medium is much
more protective than PspA isolated by other means, e.g., from extracts. Perhaps, without wishing to necessarily be
bound by any one particular theory, there is a synergistic effect upon PspA by the other components present prior to
isolation, or simply PspA is more protective (more antigenic) prior to isolation and/or purification (implying a possibility
of some loss of activity from the step of isolation and/or purification).

[0044] The invention further relates to the N-terminal 115 amino acids of PspA, which is useful for compositions
comprising an epitope of interest, immunological or vaccine compositions, as well as the DNA coding therefor, which 
is useful in preparing these N-terminal amino acids by recombination, or for use as probes and/or primers for hybrid- 
ization and/or amplification for identification, detection or diagnosis purposes. 

[0045] The invention further demonstrates that there is a grouping among the pspA RFLP families. This provides a 
method of identifying families of different PspAs based on RFLP pattern of pspAs, as well as a means for obtaining 
diversity of PspAs in an antigenic, immunological or vaccine composition; and, a method of characterizing clonotypes 
of PspA based on RFLP patterns of PspA. And, the invention thus provides oligonucleotides which permit amplification 
of most, e.g., a majority, if not all of S. pneumoniae and thereby permit RFLP analysis of a majority, if not all, S. pneu- 
moniae. 
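
The RFLP grouping described above can be illustrated with a minimal in-silico sketch, offered only under stated assumptions: an amplified product is cut at every occurrence of an enzyme's recognition sequence, and two isolates are placed in the same family when their fragment-length patterns match. HhaI (GCG^C) and Sau3AI (^GATC) are the enzymes named for Figures 9 and 10; the two short amplicon strings below are invented placeholders, not pspA sequence, and the function names are hypothetical.

```python
import re

# Recognition sequence and cut position, counted in bases from the start of
# the site on the top strand: HhaI cuts GCG^C, Sau3AI cuts ^GATC.
ENZYMES = {
    "HhaI": ("GCGC", 3),
    "Sau3AI": ("GATC", 0),
}


def fragment_lengths(seq, enzyme):
    """Fragment lengths, in order, after complete digestion of a linear DNA."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # A lookahead is used so that overlapping sites would also be found.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces (e.g. a cut exactly at the first base).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def rflp_pattern(seq, enzyme):
    """Size-sorted pattern, comparable to the bands read from a gel."""
    return tuple(sorted(fragment_lengths(seq, enzyme), reverse=True))


# Hypothetical toy amplicons; B differs from A by one base that destroys
# the second Sau3AI site.
AMPLICON_A = "AAAAGATCTTTTTTGCGCAAAAAAAAGATCCC"
AMPLICON_B = "AAAAGATCTTTTTTGCGCAAAAAAAAGATTCC"
```

In this toy case the two amplicons give the same HhaI pattern but different Sau3AI patterns, which is the kind of difference used to assign isolates to distinct RFLP families.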

[0046] The invention also provides PspC, having an approximate molecular weight of 105 kD, with an estimated pI
of 6.09, and comprising an alpha-helical region, followed by a proline-rich domain and repeat region. A major cross- 
protective region of PspA comprises the C-terminal third of the alpha-helical region (between residues 192 and 260 of 
PspA), which region accounts for the binding of 4 of 5 cross-protective MAb, and PspA fragments comprising this 
region can elicit cross-protective immunity in mice. Homology between PspC and PspA begins at amino acid 148 of 
PspA, thus including the region from 192 to 299, and including the entire PspC sequence C-terminal of amino acid
486. Due to the substantial sequence homology between PspA and PspC in a region comprising the epitopes of interest,
known to be protection eliciting, PspC is likely to comprise epitopes of interest similar to those found in PspA. Antibodies
specific for this region of PspA, i.e., between amino acids 148 and 299, should cross-react with PspC, and thus afford 
protection by reacting with PspC and PspA. Similarly, immunization with PspC would be expected to elicit antibodies 
cross-protective against PspA. 

[0047] An epitope of interest is an antigen or immunogen or immunologically active fragment thereof from a pathogen
or toxin of veterinary or human interest. 

[0048] The present invention provides an immunogenic, immunological or vaccine composition containing the
pneumococcal epitope of interest, and a pharmaceutically acceptable carrier or diluent. An immunological composition
containing the pneumococcal epitope of interest elicits an immunological response, local or systemic. The response
can, but need not be, protective. An immunogenic composition containing the pneumococcal epitope of interest
likewise elicits a local or systemic immunological response which can, but need not be, protective. A vaccine composition
elicits a local or systemic protective response. Accordingly, the terms "immunological composition" and "immunogenic
composition" include a "vaccine composition" (as the two former terms can be protective compositions).
[0049] The invention therefore also provides a method of inducing an immunological response in a host mammal
comprising administering to the host an immunogenic, immunological or vaccine composition comprising the
pneumococcal epitope of interest, and a pharmaceutically acceptable carrier or diluent.

[0050] The DNA encoding the pneumococcal epitope of interest can be DNA which codes for full length PspA, PspC, 
or fragments thereof. A sequence which codes for a fragment of PspA or PspC can encode that portion of PspA or 
PspC which contains an epitope of interest, such as a protection-eliciting epitope of the protein. 
[0051] Regions of PspA and PspC have been identified from the Rx1 strain of S. pneumoniae which not only contain 
protection-eliciting epitopes, but are also sufficiently cross-reactive with other PspAs from other S. pneumoniae strains 
so as to be suitable candidates for the region of PspA to be incorporated into a vaccine, immunological or immunogenic 
composition. Epitopic regions of PspA include residues 1 to 115, 1 to 314, 192 to 260 and 192 to 588. DNA encoding 
fragments of PspA can comprise DNA which codes for the aforementioned epitopic regions of PspA; or it can comprise 
DNA encoding overlapping fragments of PspA, e.g., fragment 192 to 588 includes 192 to 260, and fragment 1 to 314 
includes 1 to 115 and 192 to 260. 
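
The overlap relationships among the epitopic regions listed above can be checked mechanically. The following is a minimal sketch, assuming the residue numbers are 1-based, inclusive positions in PspA; the names EPITOPIC_REGIONS, contains and nested_regions are hypothetical and serve only to illustrate the containment stated in the text.

```python
# Epitopic regions of PspA as (first residue, last residue), taken from the
# paragraph above; coordinates are treated as 1-based and inclusive.
EPITOPIC_REGIONS = [(1, 115), (1, 314), (192, 260), (192, 588)]


def contains(outer, inner):
    """True if the inner residue range lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def nested_regions(regions):
    """Map each region to the other listed regions that it fully includes."""
    return {
        outer: [r for r in regions if r != outer and contains(outer, r)]
        for outer in regions
    }


NESTED = nested_regions(EPITOPIC_REGIONS)
```

Applied to the four listed regions, this reproduces the statement that fragment 192 to 588 includes 192 to 260, and that fragment 1 to 314 includes both 1 to 115 and 192 to 260.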

[0052] As to epitopes of interest, one skilled in the art can determine an epitope or immunodominant region of a
peptide or polypeptide and ergo the coding DNA therefor from the knowledge of the amino acid and corresponding 
DNA sequences of the peptide or polypeptide, as well as from the nature of particular amino acids (e.g., size, charge, 
etc.) and the codon dictionary, without undue experimentation. 

[0053] A general method for determining which portions of a protein to use in an immunological composition focuses 
on the size and sequence of the antigen of interest. "In general, large proteins, because they have more potential 
determinants, are better antigens than small ones. The more foreign an antigen, that is the less similar to self
configurations which induce tolerance, the more effective it is in provoking an immune response." Ivan Roitt, Essential
Immunology, 1988.

[0054] As to size, the skilled artisan can maximize the size of the protein encoded by the DNA sequence to be inserted
into the viral vector (keeping in mind the packaging limitations of the vector). To minimize the DNA inserted while
maximizing the size of the protein expressed, the DNA sequence can exclude introns (regions of a gene which are
transcribed but which are subsequently excised from the primary RNA transcript).

[0055] At a minimum, the DNA sequence can code for a peptide at least 8 or 9 amino acids long. This is the minimum
length that a peptide needs to be in order to stimulate a CD4+ T cell response (which recognizes virus infected cells
or cancerous cells). A minimum peptide length of 13 to 25 amino acids is useful to stimulate a CD8+ T cell response
(which recognizes special antigen presenting cells which have engulfed the pathogen). See Kendrew, supra. However,
as these are minimum lengths, these peptides are likely to generate an immunological response, i.e., an antibody or
T cell response; but, for a protective response (as from a vaccine composition), a longer peptide is preferred.
[0056] With respect to the sequence, the DNA sequence preferably encodes at least regions of the peptide that
generate an antibody response or a T cell response. One method to determine T and B cell epitopes involves epitope
mapping. The protein of interest "is fragmented into overlapping peptides with proteolytic enzymes. The individual
peptides are then tested for their ability to bind to an antibody elicited by the native protein or to induce T cell or B cell
activation. This approach has been particularly useful in mapping T-cell epitopes since the T cell recognizes short
linear peptides complexed with MHC molecules. The method is less effective for determining B-cell epitopes" since B
cell epitopes are often not a linear amino acid sequence but rather result from the tertiary structure of the folded three
dimensional protein. Janis Kuby, Immunology, (1992) pp. 79-80.
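
The idea of testing overlapping peptides can be sketched computationally. This is only an illustration: the quoted method fragments the protein with proteolytic enzymes, whereas the sketch below simply enumerates fixed-length windows, and the example sequence, window length and step are hypothetical choices rather than values from this disclosure.

```python
def overlapping_peptides(sequence, length, step):
    """Yield (start, peptide) pairs, with 1-based start positions, for windows
    of `length` residues that advance by `step` residues."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    for i in range(0, len(sequence) - length + 1, step):
        yield i + 1, sequence[i:i + length]


if __name__ == "__main__":
    # Hypothetical 22-residue sequence, used only to show the windowing.
    example = "MKTAYIAKQRQISFVKSHFSRQ"
    for start, peptide in overlapping_peptides(example, length=9, step=3):
        print(start, peptide)
```

With a length of 9 and a step of 3, consecutive peptides share 6 residues, so an epitope of up to 7 residues that lies within the scanned region falls entirely inside at least one peptide. Residues after the last full window are not covered and would need a final, shifted peptide.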

[0057] Another method for determining an epitope of interest is to choose the regions of the protein that are
hydrophilic. Hydrophilic residues are often on the surface of the protein and therefore often the regions of the protein
which are accessible to the antibody. Janis Kuby, Immunology, (1992) p. 81.
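
One common way to locate hydrophilic regions is a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption introduced here for illustration (the text does not name a scale); the window size and the example sequence are likewise hypothetical.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window; entry 0 is the window at residue 1."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (1-based start, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start + 1, sequence[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical sequence: a charged stretch flanked by leucine runs.
    print(most_hydrophilic_window("LLLLLLLKDEKRDELLLLLLL"))
```

A low average only suggests surface exposure; it does not by itself show that a region is antigenic.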
[0058] Yet another method for determining an epitope of interest is to perform an X-ray crystallographic analysis of
the antigen (full length)-antibody complex. Janis Kuby, Immunology, (1992) p. 80.

[0059] Still another method for choosing an epitope of interest which can generate a T cell response is to identify 
from the protein sequence potential HLA anchor binding motifs which are peptide sequences which are known to be 
likely to bind to the MHC molecule. 
[0060] The peptide which is a putative epitope, to generate a T cell response, should be presented in an MHC complex.
The peptide preferably contains appropriate anchor motifs for binding to the MHC molecules, and should bind with
high enough affinity to generate an immune response. Factors which can be considered are: the HLA type of the patient
(vertebrate, animal or human) expected to be immunized, the sequence of the protein, the presence of appropriate
anchor motifs and the occurrence of the peptide sequence in other vital cells.
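
Screening a sequence for candidate anchor motifs can be sketched as a simple pattern scan. The pattern below (a 9-residue peptide with L or M at position 2 and V or L at position 9) is an illustrative assumption, not a motif taken from this disclosure; in practice the motif would be chosen from the published literature for the HLA type of interest, and the example sequence is hypothetical.

```python
import re

# Illustrative anchor pattern (an assumption): any residue, then L or M at
# position 2, six further residues, then V or L at position 9. The lookahead
# lets overlapping candidates be reported.
ANCHOR_PATTERN = re.compile(r"(?=(.[LM].{6}[VL]))")


def find_anchor_candidates(sequence):
    """Return (1-based start, 9-mer) for every window matching the pattern."""
    return [(m.start() + 1, m.group(1)) for m in ANCHOR_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence containing one matching 9-mer.
    print(find_anchor_candidates("AAKLAAAAAAVGG"))
```

A match only flags a candidate; binding affinity and occurrence of the same sequence in host proteins, the other factors named above, still have to be assessed separately.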

[0061] An immune response is generated, in general, as follows: T cells recognize proteins only when the protein
has been cleaved into smaller peptides and is presented in a complex called the "major histocompatibility complex"
(MHC) located on another cell's surface. There are two classes of MHC complexes, class I and class II, and each class
is made up of many different alleles. Different patients have different types of MHC complex alleles; they are said to
have a 'different HLA type'.

[0062] Class I MHC complexes are found on virtually every cell and present peptides from proteins produced inside
the cell. Thus, Class I MHC complexes are useful for killing cells which have been infected by viruses or which have
become cancerous as the result of expression of an oncogene. T cells which have a protein called CD4 on their surface
bind to the MHC class I cells and secrete lymphokines. The lymphokines stimulate a response; cells arrive and kill the
viral infected cell.

[0063] Class II MHC complexes are found only on antigen-presenting cells and are used to present peptides from
circulating pathogens which have been endocytosed by the antigen-presenting cells. T cells which have a protein
called CD8 bind to the MHC class II cells and kill the cell by exocytosis of lytic granules.

[0064] Some guidelines in determining whether a protein is an epitope of interest which will stimulate a T cell
response include: Peptide length - the peptide should be at least 8 or 9 amino acids long to fit into the MHC class I
complex and at least 13-25 amino acids long to fit into a class II MHC complex. This length is a minimum for the peptide
to bind to the MHC complex. It is preferred for the peptides to be longer than these lengths because cells may cut the
expressed peptides. The peptide should contain an appropriate anchor motif which will enable it to bind to the various
class I or class II molecules with high enough specificity to generate an immune response (See Bocchia, M. et al.,
Specific Binding of Leukemia Oncogene Fusion Protein Peptides to HLA Class I Molecules, Blood 85:2680-2684;
Englehard, VH, Structure of peptides associated with class I and class II MHC molecules, Ann. Rev. Immunol. 12:181
(1994)). This can be done, without undue experimentation, by comparing the sequence of the protein of interest with
published structures of peptides associated with the MHC molecules. Protein epitopes recognized by T cell receptors
are peptides generated by enzymatic degradation of the protein molecule and are presented on the cell surface in
association with class I or class II MHC molecules.

[0065] Further, the skilled artisan can ascertain an epitope of interest by comparing the protein sequence with
sequences listed in the protein database. Regions of the protein which share little or no homology are better choices for
being an epitope of that protein and are therefore useful in a vaccine or immunological composition. Regions which
share great homology with widely found sequences present in vital cells should be avoided.

[0066] Even further, another method is simply to generate or express portions of a protein of interest, generate
monoclonal antibodies to those portions of the protein of interest, and then ascertain whether those antibodies inhibit
growth in vitro of the pathogen from which the protein was derived. The skilled artisan can use the other
guidelines set forth in this disclosure and in the art for generating or expressing portions of a protein of interest for
analysis as to whether antibodies thereto inhibit growth in vitro. For example, the skilled artisan can generate portions
of a protein of interest by: selecting 8 to 9 or 13 to 25 amino acid length portions of the protein, selecting hydrophilic
regions, selecting portions shown to bind from X-ray data of the antigen (full length)-antibody complex, selecting regions
which differ in sequence from other proteins, selecting potential HLA anchor binding motifs, or any combination of
these methods or other methods known in the art.

[0067] Epitopes recognized by antibodies are expressed on the surface of a protein. To determine the regions of a
protein most likely to stimulate an antibody response, one skilled in the art can preferably perform epitope mapping,
using the general methods described above, or other mapping methods known in the art.

[0068] As can be seen from the foregoing, without undue experimentation, from this disclosure and the knowledge
in the art, the skilled artisan can ascertain the amino acid and corresponding DNA sequence of an epitope of interest
for obtaining a T cell, B cell and/or antibody response. In addition, reference is made to Gefter et al., U.S. Patent No.
5,019,384, issued May 28, 1991, and the documents it cites, incorporated herein by reference (Note especially the
"Relevant Literature" section of this patent, and column 13 of this patent which discloses that: "A large number of
epitopes have been defined for a wide variety of organisms of interest. Of particular interest are those epitopes to
which neutralizing antibodies are directed. Disclosures of such epitopes are in many of the references cited in the
Relevant Literature section.")

[0069] Further, the invention demonstrates that more than one serologically complementary PspA molecule can be 
in an antigenic, immunological or vaccine composition, so as to elicit better response, e.g., protection, for instance, 
against a variety of strains of pneumococci; and, the invention provides a system of selecting PspAs for a multivalent 
composition which includes cross-protection evaluation so as to provide a maximally efficacious composition. 

[0070] The determination of the amount of antigen, e.g., PspA or truncated portion thereof, and optional adjuvant in
the inventive compositions and the preparation of those compositions can be in accordance with standard techniques
well known to those skilled in the pharmaceutical or veterinary arts. In particular, the amount of antigen and adjuvant
in the inventive compositions and the dosages administered are determined by techniques well known to those skilled
in the medical or veterinary arts taking into consideration such factors as the particular antigen, the adjuvant (if present),
the age, sex, weight, species and condition of the particular patient, and the route of administration. For instance,
dosages of particular PspA antigens for suitable hosts in which an immunological response is desired can be readily
ascertained by those skilled in the art from this disclosure (see, e.g., the Examples), as is the amount of any adjuvant
typically administered therewith. Thus, the skilled artisan can readily determine the amount of antigen and optional
adjuvant in compositions and to be administered in methods of the invention. Typically, an adjuvant is used
as a 0.001 to 50 wt% solution in phosphate buffered saline, and the antigen is present on the order of micrograms to
milligrams, such as about 0.0001 to about 5 wt%, preferably about 0.0001 to about 1 wt%, most preferably about
0.0001 to about 0.05 wt% (see, e.g., Examples below or in applications cited herein).

[0071] Typically, however, the antigen is present in an amount on the order of micrograms to milligrams, or, about
0.001 to about 20 wt%, preferably about 0.01 to about 10 wt%, and most preferably about 0.05 to about 5 wt% (see,
e.g., Examples below).
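
To see how a weight percentage relates to the "micrograms to milligrams" scale mentioned above, the conversion can be written out. This is a minimal arithmetic sketch; the 0.5 g dose mass is a hypothetical value chosen only for illustration and is not taken from this disclosure.

```python
def antigen_mass_micrograms(dose_mass_grams, weight_percent):
    """Mass of antigen, in micrograms, in a dose of the given total mass."""
    return dose_mass_grams * (weight_percent / 100.0) * 1_000_000


if __name__ == "__main__":
    dose = 0.5  # grams; hypothetical dose mass
    for wt in (0.05, 5.0):  # the "most preferably" bounds given in the text
        print(f"{wt} wt% of {dose} g = {antigen_mass_micrograms(dose, wt):.0f} micrograms")
```

For the assumed 0.5 g dose, 0.05 wt% corresponds to about 250 micrograms and 5 wt% to about 25 milligrams, which is consistent with the stated order of magnitude.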

[0072] Of course, for any composition to be administered to an animal or human, including the components thereof,
and for any particular method of administration, it is preferred to determine therefor: toxicity, such as by determining
the lethal dose (LD) and LD50 in a suitable animal model, e.g., a rodent such as a mouse; and, the dosage of the
composition(s), concentration of components therein and timing of administering the composition(s), which elicit a
suitable immunological response, such as by titrations of sera and analysis thereof for antibodies or antigens, e.g., by
ELISA and/or RFFIT analysis. Such determinations do not require undue experimentation from the knowledge of the
skilled artisan, this disclosure and the documents cited herein. And, the time for sequential administrations can be
ascertained without undue experimentation.

[0073] Examples of compositions of the invention include liquid preparations for orifice, e.g., oral, nasal, anal, vaginal,
peroral, intragastric, mucosal (e.g., perlingual, alveolar, gingival, olfactory or respiratory mucosa) etc., administration
such as suspensions, syrups or elixirs; and, preparations for parenteral, subcutaneous, intradermal, intramuscular or
intravenous administration (e.g., injectable administration), such as sterile suspensions or emulsions. Such
compositions may be in admixture with a suitable carrier, diluent, or excipient such as sterile water, physiological saline,
glucose or the like. The compositions can also be lyophilized. The compositions can contain auxiliary substances such as
wetting or emulsifying agents, pH buffering agents, gelling or viscosity enhancing additives, preservatives, flavoring
agents, colors, and the like, depending upon the route of administration and the preparation desired. Standard texts,
such as "REMINGTON'S PHARMACEUTICAL SCIENCE", 17th edition, 1985, incorporated herein by reference, may
be consulted to prepare suitable preparations, without undue experimentation.

[0074] Compositions of the invention are conveniently provided as liquid preparations, e.g., isotonic aqueous
solutions, suspensions, emulsions or viscous compositions which may be buffered to a selected pH. If digestive tract
absorption is preferred, compositions of the invention can be in the "solid" form of pills, tablets, capsules, caplets and the
like, including "solid" preparations which are time-released or which have a liquid filling, e.g., gelatin covered liquid,
whereby the gelatin is dissolved in the stomach for delivery to the gut. If nasal or respiratory (mucosal) administration
is desired, compositions may be in a form and dispensed by a squeeze spray dispenser, pump dispenser or aerosol
dispenser. Aerosols are usually under pressure by means of a hydrocarbon. Pump dispensers can preferably dispense
a metered dose or a dose having a particular particle size.

[0075] Compositions of the invention can contain pharmaceutically acceptable flavors and/or colors for rendering
them more appealing, especially if they are administered orally. The viscous compositions may be in the form of gels,
lotions, ointments, creams and the like and will typically contain a sufficient amount of a thickening agent so that the
viscosity is from about 2500 to 6500 cps, although more viscous compositions, even up to 10,000 cps, may be employed.
Viscous compositions have a viscosity preferably of 2500 to 5000 cps, since above that range they become more
difficult to administer. However, above that range, the compositions can approach solid or gelatin forms which are then
easily administered as a swallowed pill for oral ingestion.

[0076] Liquid preparations are normally easier to prepare than gels, other viscous compositions, and solid
compositions. Additionally, liquid compositions are somewhat more convenient to administer, especially by injection or
orally, to animals, children, particularly small children, and others who may have difficulty swallowing a pill, tablet,
capsule or the like, or in multi-dose situations. Viscous compositions, on the other hand, can be formulated within the
appropriate viscosity range to provide longer contact periods with mucosa, such as the lining of the stomach or nasal mucosa.
[0077] Obviously, the choice of suitable carriers and other additives will depend on the exact route of administration 
and the nature of the particular dosage form, e.g., liquid dosage form [e.g., whether the composition is to be formulated 
into a solution, a suspension, gel or another liquid form], or solid dosage form [e.g., whether the composition is to be 
formulated into a pill, tablet, capsule, caplet, time release form or liquid-filled form]. 

[0078] Solutions, suspensions and gels normally contain a major amount of water (preferably purified water) in
addition to the antigen, lipoprotein and optional adjuvant. Minor amounts of other ingredients such as pH adjusters
(e.g., a base such as NaOH), emulsifiers or dispersing agents, buffering agents, preservatives, wetting agents, jelling
agents (e.g., methylcellulose), colors and/or flavors may also be present. The compositions can be isotonic, i.e., they
can have the same osmotic pressure as blood and lacrimal fluid.

[0079] The desired isotonicity of the compositions of this invention may be accomplished using sodium chloride, or
other pharmaceutically acceptable agents such as dextrose, boric acid, sodium tartrate, propylene glycol or other
inorganic or organic solutes. Sodium chloride is preferred particularly for buffers containing sodium ions.
[0080] Viscosity of the compositions may be maintained at the selected level using a pharmaceutically acceptable
thickening agent. Methylcellulose is preferred because it is readily and economically available and is easy to work with.
Other suitable thickening agents include, for example, xanthan gum, carboxymethyl cellulose, hydroxypropyl cellulose,
carbomer, and the like. The preferred concentration of the thickener will depend upon the agent selected. The important
point is to use an amount which will achieve the selected viscosity. Viscous compositions are normally prepared from
solutions by the addition of such thickening agents.

[0081] A pharmaceutically acceptable preservative can be employed to increase the shelf-life of the compositions.
Benzyl alcohol may be suitable, although a variety of preservatives including, for example, parabens, thimerosal,
chlorobutanol, or benzalkonium chloride may also be employed. A suitable concentration of the preservative will be from
0.02% to 2% based on the total weight, although there may be appreciable variation depending upon the agent selected.
[0082] Those skilled in the art will recognize that the components of the compositions must be selected to be
chemically inert with respect to the PspA antigen and optional adjuvant. This will present no problem to those skilled in
chemical and pharmaceutical principles, or problems can be readily avoided by reference to standard texts or by simple
experiments (not involving undue experimentation), from this disclosure and the documents cited herein.
[0083] The immunologically effective compositions of this invention are prepared by mixing the ingredients following
generally accepted procedures. For example, the selected components may be simply mixed in a blender, or other
standard device, to produce a concentrated mixture which may then be adjusted to the final concentration and viscosity
by the addition of water or thickening agent and possibly a buffer to control pH or an additional solute to control tonicity.
Generally the pH may be from about 3 to 7.5. Compositions can be administered in dosages and by techniques well
known to those skilled in the medical and veterinary arts taking into consideration such factors as the age, sex, weight,
and condition of the particular patient or animal, and the composition form used for administration (e.g., solid vs. liquid).
Dosages for humans or other mammals can be determined without undue experimentation by the skilled artisan, from
this disclosure, the documents cited herein, and the Examples below (e.g., from the Examples involving mice).
[0084] Suitable regimes for initial administration and booster doses or for sequential administrations also are variable,
and may include an initial administration followed by subsequent administrations; but nonetheless, they may be
ascertained by the skilled artisan, from this disclosure, the documents cited herein, and the Examples below.

[0085] PCR techniques for amplifying sample DNA for diagnostic detection or assay methods are known from the
art cited herein and the documents cited herein (see Examples), as are hybridization techniques for such methods.
And, without undue experimentation, the skilled artisan can use gene products and antibodies therefrom in diagnostic,
detection or assay methods by procedures known in the art.

[0086] The following Examples are provided for illustration and are not to be considered a limitation of the invention.

EXAMPLES 

EXAMPLE 1 - TRUNCATED STREPTOCOCCUS PNEUMONIAE PspA MOLECULES ELICIT CROSS-PROTECTIVE
IMMUNITY AGAINST PNEUMOCOCCAL CHALLENGE

[0087] Since the isolation of S. pneumoniae from human saliva in 1881 and its subsequent connection with lobar
pneumonia two years later, human disease resulting from pneumococcal infection has been associated with a
significant degree of morbidity and mortality. A recent survey of urgently needed vaccines in the developing and developed
world places an improved pneumococcal vaccine among the top three vaccine priorities of industrialized countries.
The currently licensed vaccine is a 23-valent composition of pneumococcal capsular polysaccharides that is only about
60% effective in the elderly and, due to poor efficacy, is not recommended for use in children below two years of age.
Furthermore, the growing frequency of multi-drug resistant strains of S. pneumoniae being isolated accentuates the
need for a more effective vaccine to prevent pneumococcal infections.

[0088] The immunogenic nature of proteins makes them prime targets for new vaccine strategies. Pneumococcal
molecules being investigated as potential protein vaccine candidates include pneumolysin, neuraminidase, autolysin
and PspA. All of these proteins are capable of eliciting immunity in mice resulting in extension of life and protection
against death with challenge doses near the LD50. PspA is unique among these macromolecules in that it can elicit
antibodies in animals that protect against inoculums 100-fold greater than the LD50.

[0089] PspA is a surface-exposed protein with an apparent molecular weight of 67-99 kDa that is expressed by all
clinically relevant S. pneumoniae strains examined to date. Though PspAs from different pneumococcal strains are
serologically variable, many PspA antibodies exhibit cross-reactivities with PspAs from unrelated strains. Upon active
immunization with PspA, mice generate PspA antibodies that protect against subsequent challenge with diverse strains
of S. pneumoniae. The immunogenic and protection-eliciting properties of PspA suggest that it may be a good candidate
molecule for a protein-based pneumococcal vaccine.

[0090] Four distinct domains of PspA have been identified based on DNA sequence. They include an N-terminal highly
charged alpha-helical region, a proline-rich 82 amino acid stretch, a C-terminal repeat segment comprised of ten
20-amino acid repeat sequences, and a 17-amino acid tail. A panel of MAbs to Rx1 PspA has been produced and
the binding sites of nine of these MAbs were recently localized within the Rx1 pspA sequence in the alpha-helical
region. Five of the Rx1 MAbs were protective in mice infected with a virulent pneumococcal strain, WU2. Four of these
five protective antibodies were mapped to the distal third (amino acids 192-260) of the alpha-helical domain of Rx1
PspA.

[0091] Truncated PspAs containing amino acids 192-588 or 192-299 from pneumococcal strain Rx1 were cloned
and the recombinant proteins expressed and evaluated for their ability to elicit protection against subsequent challenge
with S. pneumoniae WU2. As with full-length Rx1 PspA, both truncated PspAs containing the distal alpha-helical region
protected mice against fatal WU2 pneumococcal infection. However, the recombinant PspA fragment extending from
amino acid 192 to 588 was more immunogenic than the smaller fragment, probably due to its larger size. In addition,
the protection elicited by the amino acid fragment 192-588 of Rx1 was comparable to that elicited by full-length Rx1
PspA. Therefore, cross-protective epitopes of other PspAs were also sought in the C-terminal two-thirds of the
molecule. As discussed below, PspAs homologous to amino acids 192-588 of strain Rx1 were amplified by PCR, cloned,
and expressed in E. coli. Then, three recombinant PspAs, from capsule type 4 and 5 strains, were evaluated for their
ability to confer cross-protection against challenge strains of variant capsular types. The data demonstrate that the
truncated PspAs from capsular type 4 and 5 strains collectively protect against or delay death caused by challenge
with capsular type 4 and 5 parental strains as well as type 3, 6A, and 6B S. pneumoniae.
[0092] Bacterial strains and culture conditions. All pneumococci were from the culture collection of this laboratory,
and have been described (Yother, J. et al., Infect. Immun. 1982; 36: 184-188; Briles, D.E., et al., Infect. Immun. 1992;
60: 111-116; McDaniel, L.S., et al., Microb. Pathog. 1992; 13: 261-269; and McDaniel, L.S., et al., In: Ferretti, J.J. et
al., eds. Genetics of streptococci, enterococci, and lactococci. 1995; 283-286), with the exception of clinical isolates
TJ0893, 0922134 and BG8743. Pneumococcal strains TJ0893 and 0922134 were recovered from the blood of a
43-year old male and an elderly female, respectively. S. pneumoniae BG8743 is a blood isolate from an 8-month old
infant. Strains employed in this study included capsular type 3 (A66.3, EF10197, WU2), type 4 (BG9739, EF3296,
EF5668, L81905), type 5 (DBL5), type 6A (DBL6A, EF6796), type 6B (BG7322, BG9163, DBL1), type 14 (TJ0893),
type 19 (BG8090), and type 23 (0922134, BG8743). In addition, strain WG44.1, which expresses no detectable PspA,
was employed in PspA-specific antibody analysis. All chemicals were purchased from Fisher Scientific, Fair Lawn,
New Jersey unless indicated otherwise.

[0093] S. pneumoniae were grown in Todd Hewitt broth (Difco, Detroit, Michigan) supplemented with 5% yeast extract
(Difco). Mid-exponential phase cultures were used for seeding inocula in Lactated Ringer's (Abbott Laboratories, North
Chicago, Illinois) for challenge studies. For pneumococcal strains used in challenge studies, inocula ranged from 2.8
to 3.8 log10 CFU (verified by dilution plating on blood agar). Plates were incubated overnight in a candle jar at 37°C.
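
The log10 CFU range above can be converted to colony counts, and a dilution-plating result can be converted back, with simple arithmetic. The sketch below is illustrative only; the colony count, dilution factor and plated volume in the example are hypothetical values, not data from this study.

```python
import math


def cfu_from_log10(log_cfu):
    """Convert a log10 CFU value to a CFU count."""
    return 10 ** log_cfu


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Estimate CFU/ml of the undiluted inoculum from one counted plate.

    `dilution_factor` is the fold dilution of the plated sample (e.g. 10 for 1:10).
    """
    return colonies * dilution_factor / plated_volume_ml


if __name__ == "__main__":
    for log_cfu in (2.8, 3.8):  # range stated in the text
        print(f"{log_cfu} log10 CFU is about {cfu_from_log10(log_cfu):.0f} CFU")
    # Hypothetical plate: 63 colonies from 0.1 ml of a 1:10 dilution.
    estimate = cfu_per_ml(63, 10, 0.1)
    print(f"{estimate:.0f} CFU/ml = {math.log10(estimate):.2f} log10 CFU/ml")
```

The stated range therefore corresponds to roughly 630 to 6,300 CFU per inoculum.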
[0094] E. coli DH1 and BL21(DE3) were cultured in LB medium (1% Bacto-tryptone (Difco), 0.5% Bacto Yeast (Difco),
0.5% NaCl, 0.1% dextrose). For the preparation of cell lysates, recombinant E. coli were grown in minimal E medium
supplemented with 0.05 M thiamine, 0.2% glucose, 0.1% casamino acids (Difco), and 50 µg/ml kanamycin. Permanent
bacterial stocks were stored at -80°C in growth medium containing 10% glycerol.

[0095] Construction of plasmid-based strains. pET-9a (Novagen, Madison, Wisconsin) was used for cloning
truncated pspA genes from fourteen S. pneumoniae strains: DBL5, DBL6A, WU2, BG9739, EF5668, L81905, 0922134,
BG8090, BG8743, BG9163, DBL1, EF3296, EF6796, and EF10197 (Table 1). pspA gene fragments, from fifteen
strains, were amplified by PCR using two primers provided by Connaught Laboratories, Swiftwater, Pennsylvania.
Primer N192 - 5'GGAAGGCCATATGCTCAAAGAGATTGATGAGTCT3' and primer C588 -
5'CCAAGGATCCTTAAACCCATTCACCATTGGC3' were engineered with NdeI and BamHI restriction endonuclease sites,
respectively. PCR-amplified gene products were digested with BamHI and NdeI, and ligated to linearized pET-9a
digested likewise and further treated with bacterial alkaline phosphatase (United States Biochemical Corporation,
Cleveland, Ohio) to prevent recircularization of the cut plasmid. Clones were first established in E. coli BL21(DE3)
which contained a chromosomal copy of the T7 RNA polymerase gene under the control of an inducible lacUV5 promoter.
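
The presence of the engineered restriction sites in the two primers can be confirmed by a direct string search. The sketch below assumes the primer sequences as they read once the spacing in the printed text is removed, and uses the standard recognition sequences CATATG (NdeI) and GGATCC (BamHI); the function names are hypothetical.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

# Primer sequences (5' to 3') as given in the text, with spacing removed.
PRIMERS = {
    "N192": "GGAAGGCCATATGCTCAAAGAGATTGATGAGTCT",
    "C588": "CCAAGGATCCTTAAACCCATTCACCATTGGC",
}


def find_sites(primer):
    """Return {enzyme: 1-based position} for each site present in the primer."""
    return {
        name: primer.find(site) + 1
        for name, site in SITES.items()
        if site in primer
    }


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
    print("C588 on the coding strand:", reverse_complement(PRIMERS["C588"]))
```

The NdeI site in N192 contains an ATG, and on the coding strand the C588 primer reads TAA immediately before the BamHI site, which would act as a stop codon if it is in frame; the text itself does not state this, so it is offered only as an observation.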

[0096] E. coli DH1 cells were transformed by the method of Hanahan (Hanahan, D. J. Mol. Biol. 1983; 166: 557-580).
Stable transformants were identified by screening on LB-kanamycin plates. Plasmid constructs, isolated from each of
these strains, were electroporated (Electro Cell Manipulator 600, BTX Electroporation System, San Diego, California)
into E. coli BL21(DE3) and their respective strain designations are listed in Table 1. The pET-9a vector alone was
introduced into E. coli BL21(DE3) by electroporation to yield strain RCT125 (Table 2). All plasmid constructs and
PCR-amplified pspA gene fragments were evaluated by agarose gel electrophoresis (with 1 kb DNA ladder, Gibco BRL,
Gaithersburg, Maryland). Next, Southern analysis was performed using LMpspA1, a previously described full-length
pspA probe (McDaniel, L.S. et al., Microb. Pathog. 1992; 13: 261-269) random primed labeled with
digoxigenin-11-dUTP (Genius System, Boehringer Mannheim, Indianapolis, Indiana). Hybridization was detected with
chemiluminescent sheets according to the manufacturer's instructions (Schleicher & Schuell, Keene, New Hampshire).

[0097] Cell fractionation of recombinant E. coli strains. Multiple cell fractions from transformed E. coli were evaluated
for the expression of truncated PspA molecules. Single colonies were inoculated into 3 ml LB cultures containing
kanamycin and grown overnight at 37°C. Next, an 80 ml LB culture, inoculated with a 1:100 dilution of the overnight
culture, was grown at 37°C to mid-exponential phase (absorbance of ca. 0.5) and a 1 ml sample was harvested and
resuspended (uninduced cells) prior to induction with isopropylthiogalactoside (IPTG, 0.3 mM final concentration). Following
1, 2, and 3 hr of induction, 0.5 ml of cells were centrifuged, resuspended, and labeled induced cells. The remaining
culture was divided into two aliquots, centrifuged (4000 x g, 10 min, DuPont Sorvall RC 5B Plus), and the supernatant
discarded. One pellet was resuspended in 5 ml of 20 mM Tris-HCl pH 7.4, 200 mM NaCl, 1 mM (ethylenedinitrilo)-
tetraacetic acid disodium salt (EDTA) and frozen at -20°C overnight. Cells were thawed at 65°C for 30 min, placed on
ice, and sonicated for five 10-sec pulses (0.4 relative output, Fisher Sonic Dismembrator, Dynatech Laboratories, Inc.,
Chantilly, Virginia). Next, the material was centrifuged (9000 x g, 20 min) and the supernatant was designated the crude
extract-cytoplasmic fraction. The pellet was resuspended in Tris-NaCl-EDTA buffer and labeled the insoluble cell wall
and membrane fraction. The other pellet, from the divided induced culture, was resuspended in 10 ml of 30 mM Tris-HCl
pH 8.0 containing 20% sucrose and 1 mM EDTA and incubated at room temperature for 10 min with agitation. Cells
were then centrifuged, the supernatant removed, and the pellet resuspended in 5 mM MgSO4 (10 ml, 10 min, shaking
4°C bath). This material was centrifuged and the supernatant was designated the osmotic shock-periplasmic fraction. Cell
fractions were evaluated by SDS-PAGE and immunoblot analysis.
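
The IPTG addition in the induction step is an ordinary dilution calculation. The sketch below solves C_stock x V_stock = C_final x (V_culture + V_stock) for the stock volume; the 100 mM stock concentration is a hypothetical value, and the 79 ml culture volume simply assumes the 80 ml culture less the 1 ml uninduced sample.

```python
def stock_volume_ml(culture_ml, final_mm, stock_mm):
    """Volume of stock (ml) to add so the mixture reaches `final_mm` (mM).

    Accounts for the small increase in total volume caused by the addition.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_mm * culture_ml / (stock_mm - final_mm)


if __name__ == "__main__":
    # 0.3 mM final IPTG as stated in the text; stock concentration is assumed.
    volume = stock_volume_ml(culture_ml=79.0, final_mm=0.3, stock_mm=100.0)
    print(f"Add about {volume * 1000:.0f} microliters of 100 mM IPTG")
```

With these assumed values the addition is a little under a quarter of a milliliter, so the change in culture volume is negligible in practice.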

[0098] MAbs to PspA. PspA-specific monoclonal antibodies (MAbs) XiR278 and 1A4 were used as previously
described (Crain, M.J. et al., 1990, Infect. Immun.; 58: 3293-3299). MAb P50-92D9 was produced by immunization with
DBL5 PspA. The PspA-specificity of MAb P50-92D9 was confirmed by Western analysis by its reactivity with native
PspAs from S. pneumoniae DBL5, BG9739, EF5668, and L81905 and its failure to recognize the PspA-negative control
strain WG44.1.

[0099] SDS-PAGE and immunoblot analysis. E. coli cell fractions containing recombinant PspA proteins and
biotinylated molecular weight markers (low range, Bio-Rad, Richmond, California) were separated by sodium dodecyl
sulfate-polyacrylamide (10%; Bethesda Research Laboratories, Gaithersburg, Maryland) gel electrophoresis
(SDS-PAGE) by the method of Laemmli (Laemmli, U.K. Nature 1970; 227: 680-685). Samples were first boiled for 5
min in sample buffer containing 60 mM Tris pH 6.8, 1% 2-β-mercaptoethanol (Sigma, St. Louis, Missouri), 1% SDS,
10% glycerol, and 0.01% bromophenol blue. Gels were subsequently transferred (1 hr, 100 volts) to nitrocellulose (0.45
µm pores, Millipore, Bedford, Massachusetts) as per the method of Towbin et al. Blots were blocked with 3% casein,
0.05% Tween 20 in 10 mM Tris, 0.1 M NaCl, pH 7.4 for 30 min prior to incubating with PspA-specific monoclonal
antibodies diluted in PBST for 1 hr at 25°C. Next, the blot was washed 3 times with PBST before incubating with alkaline
phosphatase-labeled goat anti-mouse immunoglobulin (Southern Biotechnology Associates, Inc., Birmingham,
Alabama) for 1 hr at 25°C. Washes were performed as before and blots were developed with 0.5 mg/ml
5-bromo-4-chloro-3-indolyl phosphate and 0.01% nitro blue tetrazolium (Sigma) first dissolved in 150 µl of dimethyl
sulfoxide and then diluted in 1.5 M Tris-HCl pH 8.8. Dot blots were analyzed similarly. Lysate samples (2 µl) were
spotted on nitrocellulose filters (Millipore), allowed to dry, blocked, and detected as just described.

[0100] Preparation of cell lysates containing recombinant PspA proteins. Transformed E. coli strains RCT105,
RCT113, RCT117, and RCT125 (Table 2) were grown to mid-exponential phase in minimal E medium before IPTG
induction (2 mM final concentration, 2 hours, 37°C). Cultures were harvested by centrifugation (10 min at 9000 x g),
resuspended in Tris-acetate pH 6.9, and frozen at -80°C overnight. Samples were thawed at 65°C for 30 min, cooled
on ice, and sonicated. Next, the samples were treated with 0.2 mM AEBSF (Calbiochem, La Jolla, California) at 37°C
for 30 min and finally centrifuged to remove cell wall and membrane components. Dot blot analysis was performed
using PspA-specific MAbs to validate the presence of recombinant, truncated PspA molecules in the lysates prior to
their use as immunogens in mice. Unused lysate material was stored at -20°C until subsequent immunizations were
performed.

[0101] Mouse immunization and challenge. CBA/CAHN-XID/J mice (Jackson Laboratories, Bar Harbor, Maine), 6-12
weeks old, were employed for protection studies. These mice carry an X-linked immunodeficiency that prevents them
from generating antibody to polysaccharide components, thus making them extremely susceptible to pneumococcal
infection. Animals were immunized subcutaneously with cell lysates from E. coli recombinant strains RCT105, RCT113,
RCT117, and RCT125 (Table 2) in complete Freund's adjuvant for primary immunizations. Secondary injections were
administered in incomplete adjuvant and subsequent boosts in dH2O. Immunized and nonimmunized mice (groups of
2 to 5 animals) were challenged with S. pneumoniae strains A66.3, BG7322, DBL6A, WU2, DBL5, BG9739, and L81905
intravenously (tail vein) to induce pneumococcal sepsis. Infected animals were monitored for 21 days, and mice that
survived the 3-week evaluation period were designated protected against death and scored as surviving 22 days for
statistical analysis. Protection that resulted in extension of life was calculated as a comparison between mean number
of days to death for immunized versus pooled control mice (nonimmunized and RCT125 sham-immunized; total of 6-7
animals).
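
The scoring rule above can be illustrated with a minimal sketch. It assumes that a surviving mouse is recorded as `None` and scored as 22 days, as stated in the text; the function names and the sample values are hypothetical and are not data from Table 3.

```python
from statistics import mean

SURVIVOR_SCORE = 22  # mice alive after the 21-day period are scored as 22 days


def score_days(days_to_death):
    """Replace None (mouse survived) with the survivor score."""
    return [SURVIVOR_SCORE if d is None else d for d in days_to_death]


def extension_of_life(immunized, controls):
    """Difference in mean scored days to death, immunized minus pooled controls."""
    return mean(score_days(immunized)) - mean(score_days(controls))


# Hypothetical groups: two immunized mice survive, all pooled controls die early.
immunized_group = [6, None, None, 9]
control_group = [2, 2, 3, 2, 3, 2]
print(round(extension_of_life(immunized_group, control_group), 2))
```

Scoring survivors at 22 days gives a lower bound on their true survival time, so the mean difference computed this way is conservative for groups with survivors.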

[0102] Determination of PspA serum levels. Mice were bled retro-orbitally following the secondary boost and again
prior to challenge. Representative mouse titers were evaluated by enzyme-linked immunosorbent assay (ELISA) using
native, parental PspAs isolated from pneumococcal strains DBL5, BG9739, and L81905. PspAs were immobilized on
microtiter plates by incubating in 0.5 M NaHCO3, 0.5 M Na2CO3 pH 9.5 at 4°C overnight. Alkaline phosphatase-labeled
goat anti-mouse immunoglobulin (Southern Biotechnology Associates, Inc.) was used to detect mouse serum
antibodies. Color development was with p-nitrophenyl phosphate (Sigma, 1 mg/ml) in 0.5 mM MgCl2 pH 9.8 with 10%
diethanolamine, and absorbance was read at 405 nm after a 30 min incubation. Reciprocal titers were calculated as the last
dilution of antibody that registered an optical density value of 0.1. Sera from individual mice within a particular
immunogen group were evaluated separately and then the respective titers from four mice per group were combined to
obtain titer range (Table 3).
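
As an illustration of the titer calculation, the following sketch assumes that "the last dilution that registered an optical density value of 0.1" means the highest dilution whose reading is at or above 0.1. The function names and all readings are hypothetical.

```python
def reciprocal_titer(reciprocal_dilutions, optical_densities, cutoff=0.1):
    """Return the highest reciprocal dilution with OD >= cutoff, or None if none qualifies."""
    positive = [
        dilution
        for dilution, od in zip(reciprocal_dilutions, optical_densities)
        if od >= cutoff
    ]
    return max(positive) if positive else None


def titer_range(titers):
    """Combine per-mouse titers into a (low, high) range, ignoring undetectable sera."""
    detected = [t for t in titers if t is not None]
    return (min(detected), max(detected)) if detected else None


# Hypothetical three-fold dilution series for one serum sample.
dilutions = [100, 300, 900, 2700, 8100]
readings = [1.20, 0.80, 0.35, 0.12, 0.04]
print(reciprocal_titer(dilutions, readings))

# Hypothetical titers from four mice in one immunogen group.
print(titer_range([2700, 900, 8100, 2700]))
```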

[0103] Statistics. The one-tailed Fisher exact and two sample rank tests were used to evaluate protection against
death and extension of life in the mouse model.
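
For readers unfamiliar with the one-tailed Fisher exact test, a minimal sketch follows. It assumes a 2 x 2 table of survivors and deaths in immunized versus control mice and sums the hypergeometric probabilities of tables at least as favorable to the immunized group as the one observed. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_one_tailed(surv_imm, dead_imm, surv_ctl, dead_ctl):
    """One-tailed P value: probability of at least surv_imm survivors among the
    immunized mice if survival were unrelated to immunization."""
    n_imm = surv_imm + dead_imm
    n_ctl = surv_ctl + dead_ctl
    n_surv = surv_imm + surv_ctl
    total_tables = comb(n_imm + n_ctl, n_surv)
    upper = min(n_imm, n_surv)
    favorable = sum(
        comb(n_imm, k) * comb(n_ctl, n_surv - k)
        for k in range(surv_imm, upper + 1)
    )
    return favorable / total_tables


# Hypothetical outcome: 4 of 4 immunized mice survive, 0 of 6 controls survive.
p_value = fisher_exact_one_tailed(4, 0, 0, 6)
print(round(p_value, 4))
```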

[0104] Cloning of truncated pspA genes. Using primers N192 and C588, truncated pspA genes from fifteen diverse
pneumococcal strains representing eight different capsular types (Table 1) were amplified by PCR. Even though
variability exists in pspA genes from different strains, this result demonstrates that sufficient conservation exists between
variant pspA genes to allow sequence amplification in all strains examined to date. Successful pspA PCR-amplification
extended to all capsule types evaluated.

[0105] Fourteen of the amplified pspA genes were cloned and three clones containing truncated PspA molecules
from pneumococcal strains DBL5, BG9739, and L81905 were further studied (Table 2). To verify the constructions,
plasmids from recombinant E. coli strains RCT105, RCT113, RCT117, and RCT125 (Table 2) were isolated, digested
with NdeI and BamHI restriction endonucleases, and electrophoresed in 1% agarose side-by-side with the PCR
products used in their respective constructions (Figure 1A). The digestion reaction was complete for pRCT105, while
pRCT113 and pRCT117 digestions were incomplete (lanes 5 and 7, respectively). This gel was denatured and DNA
transferred to nylon for Southern analysis. Figure 1B depicts the corresponding Southern blot probed with full-length
Rx1 pspA DNA. Lane 1 contains pRCT125, digested vector alone, which does not react with the pneumococcal
DNA-specific probe, as expected. The pspA-specific probe hybridized with the PCR products and the digested plasmid
inserts (see arrow, Figure 1B) as well as the partially undigested pRCT113 and pRCT117 (lanes 5 and 7), confirming
successful cloning of DBL5, BG9739, and L81905 pspA DNA. Constructions were similarly confirmed with the eleven
additional recombinant strains containing truncated pspA genes from S. pneumoniae strains of different capsular and
PspA types.

[0106] Expression of recombinant PspA in E. coli BL21(DE3). Transformed E. coli strains RCT105, RCT113, RCT117,
and RCT125 were cultured to mid-exponential phase prior to the addition of IPTG to induce expression of the cloned,
truncated pspA gene in each strain. A cell fractionation experiment was performed to identify the location of recombinant
PspA proteins in transformed E. coli strains. Samples representing uninduced cells, induced cells (1 hr, 2 hr, and 3 hr
time intervals), the periplasmic fraction, the cytoplasmic fraction, and insoluble cell wall/membrane material were
resolved by SDS-PAGE. Proteins were then transferred to nitrocellulose and Western analysis was performed using
monoclonal antibodies specific for PspA epitopes.

[0107] Figure 2 reveals that both the cytoplasmic (lane 8) and the insoluble matter fractions (lane 9), from recombinant
strain RCT105, contain a protein of approximately 53.7 kDa that is recognized by MAb XiR278 and that is not seen in the
uninduced cell sample (lane 3). This protein increases in quantity in direct correlation with the length of IPTG induction
(lanes 4-6; 1 hr, 2 hr, and 3 hr respectively). No truncated RCT105 PspA was found in the periplasmic fraction (lane
7), which was expected since the pET-9a vector lacks a signal sequence that would be necessary for directing proteins
to the periplasm. The observed molecular weight (ca. 53.5 kDa) is larger than the predicted molecular weight for the
1.2 kb DBL5 pspA gene product (43.6 kDa; Figure 1A, lane 4). Like full-length Rx1 PspA, the observed and predicted
molecular weights for truncated PspAs do not agree precisely. In addition, immunoblot analysis was performed for
recombinant E. coli strains RCT113 and RCT117 (using MAbs 1A4 and P50-92D9, respectively) and similar results
were obtained, while no cell fractions from control strain RCT125 were recognized by MAb XiR278.
[0108] Evaluating the protective capacity of recombinant, truncated PspAs. The truncated PspA proteins from strains
RCT113, RCT117, and RCT105 were expressed and analyzed for their ability to generate cross-protection against a
battery of seven S. pneumoniae strains. Control mice (non-immunized and RCT125 sham-immunized) and recombinant
PspA-immunized mice were challenged with mouse-virulent strains A66.3, BG7322, DBL6A, WU2, DBL5, BG9739,
and L81905. Table 3 presents the day of death for each infected mouse.

[0109] Immunization with truncated PspA from RCT113, RCT117, and RCT105 conferred protection against death
for all mice challenged with capsular type 3 strains (A66.3 and WU2; Table 3). The three truncated PspAs also provided
significant protection against death with DBL6A and BG7322 pneumococci (capsular types 6A and 6B, respectively).
In addition, immunization with recombinant RCT113 PspA extended days to death in mice challenged with strains
DBL5, BG9739, and L81905, while RCT117 PspA prolonged the lives of mice inoculated with BG9739 pneumococci
(Table 3). Truncated BG9739 PspA elicited protection against all challenge strains (100%) evaluated in this study, while
recombinant L81905 and DBL5 truncated PspAs conferred protection against death with 71% and 57% of
S. pneumoniae challenge strains, respectively.

[0110] Anti-PspA antibody titers elicited by the three immunogens vary over approximately a 10-fold range (Table
3). The lowest antibody levels were elicited by RCT105, and this truncated PspA also elicited protection against the
fewest challenge strains. RCT113 and RCT117 elicited three and nine times as much anti-PspA antibody,
respectively. As expected, no antibody to PspA was detected in nonimmunized mice, nor was PspA-specific antibody
measured in mice immunized with the vector-only control strain (RCT125).

[0111] In summary, immunization with RCT113 and RCT117 PspAs protected mice against fatal challenge with
capsular type 3 and 6A strains and extended life for mice inoculated with type 4, 5, and 6B pneumococci. RCT105 PspA
immunization protected against fatal infection with capsular type 3 and 6B strains and prolonged time to death for type
6A S. pneumoniae but offered no protection against type 4 and 5 strains. These data demonstrate that truncated
PspAs from capsular type 4 and 5 pneumococci collectively protect mice, and ergo other hosts such as humans, against
death, or delay death, caused by each of the seven challenge strains. In general, however, more complete protection was
observed against strains of capsular type 3, 6A, and 6B than against type 4 and 5 S. pneumoniae.
[0112] PspA has been shown to be a protection-eliciting molecule of S. pneumoniae. Immunization with PspA has
also been shown to be cross-protective, although eliciting more complete protection against certain strains than others.
Thus, it is possible that a broadly protective PspA vaccine might need to contain PspAs of more than one pneumococcal
strain. The distal third of the alpha-helical region of PspA has been identified as a major protective region of PspA.
Moreover, this region is presented in a very antigenic form when expressed with the intact C-terminal half of the
molecule. In this Example, the ability to use truncated PspA proteins homologous to the region of Rx1 PspA extending
from amino acid residue 192 to the C-terminus at residue 588 is demonstrated.

[0113] The C-terminal two-thirds of PspA was cloned from fourteen strains by PCR amplification of a gene fragment
of the appropriate size (1.2 kb) which hybridized with full-length Rx1 pspA. Successful PCR amplification extended to
all capsule types analyzed. Thus, the C-terminal two-thirds of PspA may be amplified from many, if not all, pneumococcal
capsule types with Rx1 pspA-specific primers. This technique is thus applicable to the development of antigenic,
immunological or vaccine compositions containing multiple PspAs or fragments thereof.

[0114] Of these clones, three truncated PspA proteins were expressed and evaluated in mouse immunization studies
to determine their ability to cross-protect against challenge with a variety of pneumococcal capsular types. All three
recombinant PspAs elicited antibody reactive with their respective donor PspA and all three elicited protection against
pneumococcal infection. The two truncated PspA proteins that elicited the highest antibody responses protected
against 100% and 71% of the challenge strains. RCT105 PspA, which elicited the lowest titers of PspA-specific antibody,
yielded protection against 57% of S. pneumoniae strains evaluated. With all truncated PspAs, significant levels of
protection were observed in four of the seven challenge strains. In fact, in all instances except for one
(RCT105-immunized mice challenged with strain BG9739) the trend was for truncated PspA-immunization to elicit
protection against pneumococcal challenge. These results demonstrate that truncated Rx1 PspA (amino acids 192-588)
cross-protects mice against fatal S. pneumoniae WU2 challenge. More importantly, these data show that the homologous
regions of diverse PspAs demonstrate comparable cross-protective abilities.

[0115] Strains of capsular type 4 and 5 were more difficult to protect against than were type 3, 6A and 6B
pneumococcal strains. Serological differences in PspAs might affect cross-protection in some cases. Yet the difficulty in
protecting against the type 4 and 5 strains used herein could not be explained on this basis, since the truncated PspA
immunogens were cloned from the same three type 4 and 5 strains used for challenge. Both PspAs from the type 4
strains delayed death caused by one or both type 4 challenge strains but neither could prevent death caused by either
type 4 pneumococcal strain. Moreover, the truncated PspA from the type 5 strain DBL5 elicited protection against death
or delayed death with strains of capsular types 3, 6A and 6B but failed to protect against infection with its donor strain
or either type 4 challenge strain.

[0116] There may be several reasons why the truncated PspAs from capsular type 4 and type 5 strains failed to
protect against death even with their homologous donor S. pneumoniae strains. One possibility is that the type 4 and
5 strains chosen for study are especially virulent in the XID mouse model. XID mice fail to make antibodies to
polysaccharides and are therefore extremely susceptible to pneumococcal infection with less than 100 CFU of most strains,
including those of capsular type 3, 4, 5, 6A, and 6B. The increased mouse virulence of types 4 and 5 is apparent from
the fact that in immunologically normal mice these strains have lower LD50s and/or are more consistently fatal than
strains of capsular types 3, 6A, or 6B.

[0117] Another possibility is that epitopes critical to protection-eliciting capacity with capsular type 4 and 5 strains
are not present in the C-terminal two-thirds of PspA (amino acids 192-588), the truncated fragments used for
immunization. The critical epitopes for these strains may be located in the N-terminal two-thirds of the alpha-helical region
of their PspA molecules. Finally, it is also possible that PspA may be less exposed on some S. pneumoniae strains
than others. Strain Rx1 PspA amino acid sequence does not contain the cell wall attachment motif LPXTGX described
by Schneewind et al. found in many gram-positive bacteria. Rather, PspA has a novel anchoring mechanism that is
mediated by choline interactions between pneumococcal membrane-associated lipoteichoic acid and the repeat region
in the C-terminus of the molecule. Electron micrographic examination has confirmed the localization of PspA on the
pneumococcal surface and PspA-specific MAb data support the accessibility of surface-exposed PspA. However, it
is not known whether S. pneumoniae strains differ substantially in the degree to which different PspA regions are
exposed to the surrounding environment. Nor is it known if the quantity of PspA expressed on the bacterial cell surface
differs widely between strains.



Table 1.

pspA recombinant strains categorized by pneumococcal capsular type.

Capsular Type    Parent Strains       Respective Recombinant Strains
3                WU2, EF10197         RCT111, RCT137
4                BG9739, EF5668       RCT113, RCT115
4                L81905, EF3296       RCT117, RCT133
5                DBL5                 RCT105
6A               DBL6A, EF6796        RCT109, RCT135
6B               BG9163, DBL1         RCT129, RCT131
14               TJ0893*              none*
19               BG8090               RCT121
23               0922134, BG8743      RCT119, RCT123

* Truncated pspA amplified recently, not yet cloned



Table 2.

Description of recombinant strains used in evaluating the protection-eliciting capacity of truncated PspAs in mice.

Recombinant Strain    Description                                   Capsule Type of Parent PspA
RCT105                BL21(DE3) E. coli with pET-9a:DBL5            5
RCT113                BL21(DE3) E. coli with pET-9a:BG9739          4
RCT117                BL21(DE3) E. coli with pET-9a:L81905          4
RCT125                BL21(DE3) E. coli with pET-9a (vector only)





EXAMPLE 2 - LOCALIZATION OF PROTECTION-ELICITING EPITOPES AND PspA OF S. PNEUMONIAE

[0118] In this Example, the ability of PspA epitopes on two PspA fragments (amino acids 192-588 and 192-299) to
elicit cross-protection against a panel of diverse pneumococci is demonstrated. Also, this Example identifies regions
homologous to amino acids 192-299 of Rx1 in 15 other diverse pneumococcal strains. The DNA encoding these regions
was then amplified and cloned. The recombinant PspA fragments expressed were evaluated for their ability to elicit
cross-protection against a panel of virulent pneumococci.

[0119] Bacterial strains and media conditions. S. pneumoniae strains were grown in Todd Hewitt broth with 0.5%
yeast extract (THY) (both from Difco Laboratories, Detroit, Michigan) at 37°C or on blood agar plates containing 3%
sheep blood at 37°C under reduced oxygen tension. E. coli strains were grown in Luria-Bertani medium or minimal E
medium. Bacteria were stored at -80°C in growth medium supplemented with 10% glycerol. E. coli were transformed
by the methods of Hanahan (Hanahan, D. J. Mol. Biol. 1983; 166: 557). Ampicillin (Ap) was used at a concentration
of 100 µg/ml for E. coli.

[0120] Construction of pIN-III-ompA3 and pMAL-based E. coli strains. Recombinant plasmids pBC100 and pBAR416
that express and secrete pspA fragments from E. coli were constructed with pIN-III-ompA3 as previously described
(McDaniel, L.S. et al., Microb. Pathog. 1994; 17: 323).

[0121] The pMAL-p2 vector (New England Biolabs, Protein Fusion & Purification System, catalog #800) was used
for cloning pspA gene fragments corresponding to amino acids 192-299 from strain Rx1 and from 7 other S. pneumoniae
strains: R36A, D39, A66, BG9739, DBL5, DBL6A, and LM100. Amplification of the pspA gene fragments was done by the
polymerase chain reaction (PCR) as described previously (McDaniel, L.S. et al., Microb. Pathog. 1994; 17: 323) using
primers 5'CCGGATCCGCTCAAAGAGATTGATGAGTCTG3' [LSM4] and 5'CTGAGTCGACTGAGTTTCTGGAGCTGGAGC3'
[LSM6] made with BamHI and SalI restriction endonuclease sites, respectively. Primers were based on the
sequence of Rx1 PspA. PCR products and the pMAL vector were digested with BamHI and SalI, and ligated together.
Clones were transformed into E. coli DH5α by the methods of Hanahan. Stable transformants were selected on LB
plates containing 100 µg/ml Ap. These clones were screened on LB plates containing 0.1 mM IPTG, 80 µg/ml X-gal
and 100 µg/ml Ap and replica LB plates with 100 µg/ml Ap according to the manufacturer's instructions. The strain
designations for these constructs are listed in Table 6. Positive clones were evaluated for the correct pspA gene
fragment by agarose gel electrophoresis following plasmid isolation by the methods of Birnboim and Doly (Birnboim, H.C.
et al., Nucl. Acids Res. 1979; 7: 1513). Southern analysis was done as previously described using a full-length pspA
probe (McDaniel, L.S. et al., Microb. Pathog. 1994; 17: 323), random-prime labeled with digoxigenin-11-dUTP
(Genius System, Boehringer Mannheim, Indianapolis, Indiana) and detected by chemiluminescence.
[0122] Expression of recombinant PspA protein fragments. For induction of expression of strains BC100 and
BAR416, bacteria were grown to an optical density of approximately 0.6 at 660 nm at 37°C in minimal media, and IPTG
was added to a final concentration of 2 mM. The cells were incubated for an additional 2 hours at 37°C, harvested,
and the periplasmic contents released by osmotic shock. For strains BAR36A, BAR39, BAR66, BAR5668, BAR9739,
BARL5, BAR6A and BAR100, bacteria were grown and induced as above except that LB media + 10 mM glucose was the
culture medium. Proteins from these strains were purified over an amylose resin column according to the manufacturer's
instructions (New England Biolabs, Protein Fusion & Purification System, Catalog #800). Briefly, amylose resin was
poured into a 10 mL column and washed with column buffer. The diluted osmotic shock extract was loaded at a flow
rate of approximately 1 mL/minute. The column was then washed again with column buffer and the fusion protein eluted
off the column with column buffer containing 10 mM maltose. Lysates were stored at -20°C until further use.
[0123] Characterization of truncated PspA proteins used for immunization. The truncated PspA molecules, controls
and molecular weight markers (Bio-Rad, Richmond, CA) were electrophoresed in a 10% sodium dodecyl sulfate
(SDS)-polyacrylamide gel and electroblotted onto nitrocellulose. Rabbit polyclonal anti-PspA serum and rabbit
anti-maltose binding protein were used as the primary antibodies to probe the blots.

[0124] A direct binding ELISA procedure was used to quantitatively confirm reactivities observed by immunoblotting.
For all protein extracts, osmotic shock preparations were diluted to a concentration of 3 µg/ml in phosphate buffered
saline (PBS), and 100 µl was added to the wells of Immulon 4 microtitration plates (Dynatech Laboratories, Inc.,
Chantilly, VA). After blocking with 1.5% bovine serum albumin in PBS, unfractionated tissue culture supernates of individual
MAbs were titered in duplicate by three-fold serial dilution through seven wells and developed using an alkaline
phosphatase-labeled goat anti-mouse immunoglobulin secondary antibody (Southern Biotech Associates, Birmingham, AL)
and alkaline phosphatase substrate (Sigma, St. Louis, MO). The plates were read at 405 nm in a Dynatech plate reader
after 25 minutes, and the 30% end point was calculated for each antibody with each preparation.
[0125] Immunization and Protection Assays. Six to nine week old CBA/CAHN-XID/J (CBA/N) mice were obtained
from the Jackson Laboratory, Bar Harbor, Maine. CBA/N mice carry an X-linked immunodeficiency trait, which renders
them relatively unable to respond to polysaccharide antigens, but they do respond with normal levels of antibodies
against protein antigens. Because of the absence of antibodies reactive with the phosphocholine determinant of
C-polysaccharide in their serum, the mice are highly susceptible to pneumococcal infection. Mice immunized with the
BC100 fragment were injected inguinally with antigen emulsified in CFA, giving an approximate dose of 3 µg of protein
per mouse. Fourteen days later the mice were boosted intraperitoneally with 3 µg of antigen diluted in Ringer's lactate
without adjuvant. Control mice were immunized following the same protocol with diluent and adjuvant, but no antigen.
Mice immunized with the BAR416 fragment were injected with 0.2 ml at two sites in the sublinguinal area with antigen
emulsified in CFA. The mice were boosted inguinally fourteen days later with antigen emulsified in IFA and were boosted
a second time fourteen days later intraperitoneally with 0.2 ml of antigen diluted in Ringer's lactate without adjuvant.
[0126] Mice that were immunized with the homologues of Rx1 BAR416 were immunized as described above. The
control animals followed the same immunization protocol but received maltose binding protein (MBP) diluted 1:1 in
CFA for their immunization and were also boosted with MBP.

[0127] Serum analysis. Mice were retro-orbitally bled with a 75 µl heparinized microhematocrit capillary tube (Fisher
Scientific) before the first immunization and then once approximately 2 hours before challenge with virulent
pneumococci. The serum was analyzed for the presence of antibodies to PspA by an enzyme-linked immunosorbent assay
(ELISA) using native full-length R36A PspA as coating antigen as previously described (McDaniel, L.S. Microb. Pathog.
1994; 17: 323).

[0128] Intravenous infection of mice. Pneumococcal cultures were grown to late log phase in THY. Pneumococci
were diluted to 10^4 CFU based on the optical density at 420 nm into lactated Ringer's solution. Seven days following
the last boost injection for each group, diluted pneumococci were injected intravenously (tail vein) in a volume of 0.2
ml and plated on blood agar plates to confirm the numbers of CFU per milliliter. The final challenge dose was
approximately 50-100 times the LD50 of each pneumococcal strain listed in Tables 4-6. The survival of the mice was followed
for 21 days. Animals remaining alive after 21 days were considered to have survived the challenge.

[0129] Statistical analysis. Statistical significance of differences in days to death was calculated with the Wilcoxon
two-sample rank test. Statistical significance of survival versus death was determined using the Fisher exact test. In each
case, groups of mice immunized with PspA-containing preparations were compared to unimmunized controls, or
controls immunized with preparations lacking PspA. One-tailed, rather than two-tailed, calculations were used since
immunization with PspA or fragments of PspA has never been observed to cause a statistically significant decrease in
resistance to infection.
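
A one-tailed two-sample rank test of this kind can be sketched as an exact permutation calculation on rank sums. This is an illustrative implementation for small groups, not the procedure actually used for the Tables; tied values receive mid-ranks, survivors are assumed to be scored as 22 days, and the function names and data are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_one_tailed(immunized, controls):
    """Exact one-tailed P value that the immunized group has the larger days to death."""
    pooled = list(immunized) + list(controls)
    ranks = midranks(pooled)
    n_imm = len(immunized)
    observed = sum(ranks[:n_imm])
    as_extreme = 0
    total = 0
    # Enumerate every way of labeling n_imm of the pooled mice as "immunized".
    for labeled in combinations(range(len(pooled)), n_imm):
        total += 1
        if sum(ranks[i] for i in labeled) >= observed - 1e-9:
            as_extreme += 1
    return as_extreme / total


# Hypothetical days to death (22 = survived the 21-day observation period).
p_rank = rank_sum_one_tailed([22, 22, 9, 6], [2, 2, 3, 2, 3, 2])
print(round(p_rank, 4))
```

Full enumeration is practical only for the small group sizes typical of these mouse experiments; larger groups would call for a normal approximation or a statistics library.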

[0130] Cloning into pMAL vector. Using primers based on the sequence of Rx1 PspA, LSM4 and LSM6, pspA gene
fragments were amplified by PCR from fifteen out of fifteen pneumococcal strains examined. Seven of the eleven gene
fragments were cloned into pMAL-p2 and transformed into E. coli (Table 6). The correct insert for each new clone was
verified by agarose gel electrophoresis and Southern hybridization analysis. Plasmids from recombinant E. coli strains
BAR36A, BAR39, BAR66, BAR9739, BARL5, BAR6A and BAR100 were isolated, digested with BamHI and SalI
restriction endonucleases and electrophoresed on a 0.7% TBE agarose gel. The gel was then denatured and the DNA
transferred to a nylon membrane for Southern hybridization. The blot was probed with full-length Rx1 pspA DNA at
high stringency conditions. The cloning of R36A, D39, A66, BG9739, DBL5, DBL6A and LM100 pspA DNA into
pMal-p2 was confirmed by the recognition of all BamHI and SalI digested DNA inserts by the Rx1 probe.

[0131] Expression and conformation of truncated recombinant proteins. The transformed E. coli strains BAR36A,
BAR39, BAR66, BAR9739, BARL5, BAR6A and BAR100 were grown in LB media supplemented with 10 mM glucose
and induced with 2 mM IPTG for expression of the truncated PspA protein fused with maltose binding protein.
Transformed E. coli strains BC100 and BAR416, which express PspA fragments fused to the OmpA leader sequence in the
pIN-III-ompA3 vector, were grown in minimal medium and induced with 2 mM IPTG for expression. Both vectors,
pIN-III-ompA3 and pMal-p2, are vectors in which fusion proteins are exported to the periplasmic space. Therefore, an
osmotic shock extract from the pMal-p2 containing bacteria was run over an amylose column for purification and resolved
by SDS-PAGE and western blotting. On western blots, the protein extracts from BAR36A, BAR39, BAR66, BAR9739,
BARL5, BAR6A and BAR100 were recognized by a rabbit polyclonal antibody made to strain BC100 PspA. The
apparent Mr of full-length PspA from WU2 is 91.5 kD. The Mr of maltose binding protein is 42 kD and the expected Mr
for the PspA portion of the fusion is 12 kD. All extracts exhibited molecular weights that ranged from 54 to 80 kD. This
range of molecular weights can be attributed to the variability of pspA among different pneumococcal strains. An ELISA
with plates coated with the various cloned fragments quantitatively confirmed the reactivities that were observed in the
western blots with all protein extracts.

[0132] Protection and cross-protection against fatal pneumococcal infection elicited by cloned PspA fragments.
CBA/N mice were immunized with the truncated PspA fragment encoded by pBC100, which is composed of amino acids
192 to 588 of Rx1 PspA, and challenged with 13 different S. pneumoniae strains representing 7 different capsular types
(Table 4). With all 13 strains, the immunization resulted in protection from death or an extended time to death. With 10
of the strains the difference was statistically significant. With strains of capsular types 3, 6A, and 6B, all immunized
mice were protected against death. Although there were fewer survivors in the case of capsular types 2, 4, and 5, the
immunization with BC100 resulted in significant increases in times to death.

[0133] The BC100 immunization studies made it clear that epitopes C-terminal to residue 192 could elicit
cross-protection. The BAR416 fragment, which includes amino acids 192-299, could elicit protection from fatal infection with
a single challenge strain WU2. This Example shows the ability of BAR416 immunization to protect against the 6 strains
that had been best protected against by immunization with BC100. Immunization with the BAR416 construct resulted
in increased time to death for all 6 challenge strains examined (Table 5). BAR416 provided significant protection against
death with WU2, A66, BG7322 and EF6796 pneumococci (capsular types 3, 3, 6B and 6A respectively). It also
prolonged the lives of mice challenged with ATCC6303 and DBL6A pneumococci (capsular types 3 and 6A respectively).
Serum from mice immunized with the BAR416 fragment yielded a geometric mean reciprocal anti-PspA ELISA titer to
full-length Rx1 PspA of 750. Mice immunized with BC100 had geometric mean reciprocal titers of close to 2000, while
non-immunized mice had anti-PspA titers of <10.
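
Because titers come from serial dilutions and span orders of magnitude, they are summarized here by a geometric rather than an arithmetic mean. The sketch below shows the calculation; the function name and the individual titers are hypothetical, chosen only so that the geometric mean works out to 750.

```python
from math import exp, log


def geometric_mean(titers):
    """Geometric mean of positive reciprocal titers, computed on the log scale."""
    return exp(sum(log(t) for t in titers) / len(titers))


# Hypothetical reciprocal titers from three mice (250 * 750 * 2250 == 750 ** 3).
group_titers = [250, 750, 2250]
gm = geometric_mean(group_titers)
arithmetic = sum(group_titers) / len(group_titers)
print(round(gm), round(arithmetic))
```

The arithmetic mean of the same values is noticeably higher because it is pulled upward by the largest titer, which is why the two summaries should not be compared directly.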

[0134] The above data indicate that the BAR416 fragment from Rx1 elicits adequate cross-reactive immunity to
protect against diverse pneumococci and suggest that this region must be serologically conserved among PspAs.
This hypothesis was confirmed by immunizing with recombinant BAR416 homologous regions from the 7 different
clones and then challenging with strain WU2 (Table 6). All 7 immunogens elicited significant protection. PspA fragments
from capsular types 2 and 22 and the rough R36A strain elicited complete protection against death with all challenged
mice. All of the other immunogens were able to extend the time to death of all the mice, with the median days to death
being 21 days or >21 days. Serum from mice immunized with the BAR416 homologous fragments had anti-PspA
reciprocal titers that ranged from 260 to 75,800 with an average of 5700, while control animals immunized with only
maltose binding protein had anti-PspA reciprocal titers of <10.

[0135] Antibody reactivities. All of the above immunization studies attest to the cross-reactivity of epitopes encoded
by amino acids from position 192-299. This region includes the C-terminal third of the α-helical region and the first
amino acids of the proline rich region. Other evidence that epitopes within this region are cross-reactive among different
PspAs comes from the cross-reactivity of a panel of nine MAbs, all of which were made by immunization with Rx1 PspA.
Four of the antibodies in the panel reacted with epitopes mapping between amino acids 192-260. The
epitopes of the other five MAbs in the panel map between amino acids 1 and 115 (McDaniel, L.S., et al., Microb. Pathog.
1994; 17: 323). Each of these 9 MAbs was tested for its ability to react with 8 different PspAs in addition to Rx1. The
5 MAbs whose epitopes were located within the first 115 amino acids reacted on average with only 1 other PspA.
Three of the 5, in fact, did not react with any of the other 8 PspAs. In contrast, the MAbs whose epitopes map between
amino acids 192 and 260 each cross-reacted with an average of 4 of the 8 non-Rx1 PspAs, and all of them reacted
with at least two non-Rx1 PspAs. Thus, based on this limited selection of individual epitopes, it would appear that epitopes
in the region from amino acids 192-260 are generally much more cross-reactive than epitopes in the region from
amino acids 1-115.

[0136] The BC100 fragment of Rx1 PspA can elicit protection against the encapsulated type 3 strain WU2. Although
the PspAs of the two strains can be distinguished serologically they are also cross-reactive (Crain, M.J., et al., Infect.
Immun. 1990; 58: 3293). The earlier finding made it clear that epitopes cross-protective between Rx1 and WU2 PspAs
exist. The importance of cross-reactions in the region C-terminal to residue 192 is demonstrated in this Example, where
13 mouse virulent challenge strains were used and detectable protection was elicited against all of them. The first indication
that epitopes C-terminal to position 192 might be able to elicit cross-protection came from our earlier study where we
showed that the MAbs Xi64, XiR278, XiR1323, and XiR1325, whose epitopes mapped between amino acids 192 and 260
of strain Rx1 PspA, could protect against infection with WU2. Moreover, immunization with PspA fragments from
192-588 and 192-299 was able to elicit protection against infection with WU2. This Example shows that the BC100
Rx1 fragment (192-588) elicits significant protection against each of 13 different mouse virulent pneumococci, thereby
firmly establishing the ability of epitopes C-terminal to position 192 to elicit a protective response. The observation that
a fusion protein containing amino acids 192-299 fused C-terminal to maltose binding protein could also elicit
cross-protection permits the conclusion that epitopes in this 107 amino acid region of PspA are sufficient to elicit significant
cross-protection against a number of different strains.

[0137] Evidence that a comparable region of other PspAs is also able to elicit cross-protection came from the studies
where sequences homologous to the 192-299 region of Rx1 PspA were made from 5 other PspAs. All 5 of these
fragments elicited significant protection against challenge with strain WU2. These data provide some suggestion for
serologic differences in cross-protection elicited by the 192-299 region.

[0138] Based on present evidence, without wishing to be bound by any one particular theory, it is submitted that the
PspAs in strains D39, Rx1 and R36A are identical. All of the 9 mice immunized with the 192-299 fragments from R36A
and D39 survived challenge with WU2. Only LM100, one of the non-R36A/D39 PspAs, protected as high a percentage
of mice from WU2. The difference in survival elicited by the R36A/D39 PspAs and the non-Rx1 related PspAs was
statistically significant.

[0139] The data do indicate, however, that not all of the differences in protection against different strains are due
to differences in serologic cross-reactivity. BC100, which is made from Rx1, protected against death in 100% of the
mice challenged with 7 different strains of S. pneumoniae, but only delayed death with strain D39, which is thought to
have the same PspA as strain Rx1. Thus, some of the differences in cross-protection are undoubtedly related to factors
other than PspA cross-reactivity. Whether such factors are related to differences in virulence of the different strains in
the hypersusceptible XID mouse, or differences in requirements for epitopes N-terminal to amino acid 192, or differences
in the role of PspA in different strains is not yet known.

[0140] These results suggest that a vaccine containing only the recombinant PspA fragments homologous with Rx1
amino acids 192-299 is effective against pneumococcal infection. Moreover, the results demonstrate the utility of PspA
a.a. 192-299, a.a. 192-260 and DNA coding therefor, e.g. primers N 192 or 588 (variants of LSM4 and LSM2), as useful
for detecting the presence of pneumococci by detecting presence of that which binds to the amino acid or to the
DNA, or which is amplified by the DNA, e.g., by using that DNA as a hybridization probe, or as a PCR primer, or by
using the amino acids in antibody-binding kits, assays or tests; and, the results demonstrate that a.a. 192-299 and
a.a. 192-260 can be used to elicit antibodies for use in antibody-binding kits, assays or tests.

EXAMPLE 3 - ISOLATION OF PspA AND TRUNCATED FORMS THEREOF AND IMMUNIZATION THEREBY

[0141] PspA is attached to the pneumococcal surface through a choline binding site on PspA. This allows for
successful procedures for the isolation of FL-PspA. PspA can be released from the surface of pneumococci by elution
with 2 percent choline chloride (CC), or by growth in a chemically defined medium (CDM) containing 1.2 percent CC
(CDM-CC), or by growth in CDM containing ethanolamine and only trace choline (CDM-ET). Since CDM-ET supernatants
lack high concentrations of choline, the PspA released into them can be adsorbed to a choline (or choline analog)
column and isolated by elution from the column with 2 percent choline chloride (CC).

[0142] This Example describes the ability to obtain PspA by these procedures, and the ability of PspA obtained by
these procedures to elicit protection in mice against otherwise fatal pneumococcal sepsis. Native PspA from strains
R36A, Rx1, and WU2 was used because these strains have been used previously in studies of the ability of PspA to
elicit protective immunity (see, e.g., Examples infra and supra). The first MAbs to PspA were made against PspA from
strain R36A and the first cloned fragments of PspA and PspA mutants came from strain Rx1. Strain Rx1 was derived
from strain R36A, which was in turn derived from the encapsulated type 2 strain, D39. PspAs from these three strains
appear to be identical based on serologic and molecular weight analysis. Molecular studies have shown no differences
in the pspA genes of strains D39, Rx1, and R36A. The third strain that provided PspA in this Example is the mouse
virulent capsular type 3 strain WU2. Its PspA is highly cross-reactive with that from R36A and Rx1, and immunization
with Rx1 and D39 PspA can protect against otherwise fatal infections with strain WU2.

S. pneumoniae

[0143] Strains of S. pneumoniae used in this study have been described previously (Table*). Bacteria were grown
in either Todd-Hewitt broth with 0.5 percent yeast extract (THY), or a chemically defined medium (CDM) described
previously 32. Serial passage of stock cultures was avoided. Strains were maintained frozen in THY + 20 percent
glycerol and cultured from a scraping of the frozen culture.

Recovery of PspA from pneumococci 

[0144] PspA is not found in the medium of growing pneumococci unless they have reached stationary phase and
autolysis has commenced 38. To release PspA from pneumococci three procedures were used. In one approach we
grew pneumococci in 100 ml of THY and collected the cells by centrifugation at mid-log phase. The pellet was washed three
times in lactated Ringer's solution (Abbott Lab., North Chicago, IL), suspended in a small volume of 2 percent choline
chloride in phosphate buffered saline (PBS) (pH 7.0), incubated for 10 minutes at room temperature, and centrifuged
to remove the whole pneumococci. From immunoblots with anti-PspA MAb Xi126 48 at serial dilutions of the original
culture, the suspended pellet, and the supernatant, it was evident that this procedure released about half of the PspA
originally present on the pneumococci. Analysis of silver stained polyacrylamide gels showed this supernatant to
contain proteins in addition to PspA 36.

[0145] The CDM used in the remaining two procedures was modified from that of Van der Rijn 43. For normal growth
it contained 0.03% CC. To cause PspA to be released during bacterial growth, the pneumococci were grown in CDM
containing 1.2 percent choline chloride (CDM-CC), or in CDM containing 0.03 percent ethanolamine and only 0.000,001
percent choline (CDM-ET). In media lacking a normal concentration of choline the F-antigen and C-polysaccharide
contain phosphoethanolamine rather than phosphocholine 40. In CDM-CC and CDM-ET, PspA is released from the
pneumococcal surface because of its inability to bind to the cholines in the lipoteichoic acids 36. In addition to releasing
PspA from the pneumococcal surface, growth in CDM-CC or CDM-ET facilitates PspA isolation by its other effects on
the cell wall. In these media pneumococci do not autolyse 40, thus permitting them to be grown into stationary phase
to maximize the yield of PspA. In these media septation does not occur and the pneumococci grow in long chains 38, 49.
As the pneumococci reach stationary phase they die, cease making PspA, and rapidly settle out. Preliminary studies,
using serial dilution dot blots to quantitate PspA, indicated that the production of PspA ceases at about the time the
pneumococci begin to settle out, with the formation of visible strands of the condensed pneumococcal chains. When
the pneumococci began to settle out, the medium was recovered by centrifugation at 2900 x g for 20 minutes, and
filtered with a low protein-binding filter (0.45 µm Nalgene Tissue Culture Filter #158-0045).

[0146] For growth in CDM-CC or CDM-ET, the pneumococci were first adapted to the defined medium and then, in
the case of CDM-ET, to very low choline concentrations. To do this, strains were first inoculated into 1 part of THY and
9 parts of CDM medium containing 0.03 percent choline and 0.03 percent ethanolamine. After two subsequent
subcultures in CDM containing 0.03 percent choline and 0.03 percent ethanolamine (0.1 ml of culture + 0.9 ml of pre-warmed
fresh medium), the culture was used to inoculate CDM with only 0.003 percent choline (and 0.03 percent ethanolamine).
These steps were repeated until the strain would grow in CDM-ET containing 0.000,001 percent choline and 0.03 percent
ethanolamine. It was critical that cultures be passed while in exponential growth phase (at about 10^7 CFU/ml). Even
trace contamination of the medium by exogenous choline resulted in the failure of the PspA to be released from the
pneumococcal surface 36. Thus, disposable plastic ware was used for the preparation of CDM-ET media and for growth
of cultures. Once a strain was adapted to CDM-ET it was frozen in 80 percent CDM-ET and 20 percent glycerol at
-80°C. When grown subsequently the strain was inoculated directly into the CDM-ET.


Isolation of native (full-length) PspA 

[0147] PspA was isolated from the medium of cells grown in CDM-ET using choline-Sepharose prepared by
conjugating choline to epoxy-activated Sepharose 50. A separate column was used for media from different strains to avoid
cross-contamination of their different PspAs. For isolation of PspA from clarified CDM-ET, we used a 0.6 ml bed volume
of choline-Sepharose. The column bed was about 0.5 cm high and 1.4 cm in diameter. The flow rate during loading
and washing was approximately 3 ml/min. After loading 300 ml CDM-ET supernatant, the column was washed 10 times
with 3 ml volumes of 50 mM Tris acetate buffer, pH 6.9, containing 0.25 M NaCl (TAB). The washed column was eluted
with sequential 3 ml volumes of 2 percent CC in TAB. Protein eluted from the column was measured (Bio-Rad protein
assay, Bio-Rad, Hercules, CA). The column was monitored by quantitative dot blot. The loading material, washes, and
the eluted material were dot blotted (1 µl) as undiluted, 1/4, 1/16, 1/64, 1/256, and 1/1024 on nitrocellulose. The
membranes were then blocked with 1 percent BSA in PBS, incubated for 1 hr with PspA-specific MAbs Xi126 or XiR278,
and developed with biotinylated goat-anti-mouse Ig, alkaline phosphatase conjugated streptavidin (Southern
Biotechnology Associates Inc., Birmingham, AL), and nitrobluetetrazolium substrate with 5-bromo-4-chloro-3-indolyl phosphate
p-toluidine salt (Fisher Scientific, Norcross, GA) 17. The purity of eluted PspA was assessed by silver-stained (silver
stain kit, Bio-Rad, Hercules, CA) SDS-PAGE gels run as described previously 32. Immunoblots of SDS-PAGE gels were
developed with MAbs Xi126 and XiR278.

Isolation of 29 kDa PspA 


[0148] The 29 kDa fragment comprising the N-terminal 260 amino acids of PspA was produced in DH1 E. coli from
pJY4306 31, 37. An overnight culture of JY4306 was grown in 100 ml of Luria Broth (LB) containing 50 µg/ml ampicillin.
The culture was grown at 37°C in a shaker at 225 rpm. This culture was used to inoculate 6 one liter cultures that were
grown under the same conditions. When the culture O.D. at 600 nm reached 0.7, 12 grams of cells, as a wet paste,
were harvested at 4°C at 12,000 x g. The pellet was washed in 10 volumes of 25 mM Tris pH 7.7 at 0°C and suspended
in 600 ml of 20% sucrose, 25 mM Tris pH 7.7 with 10 mM ethylenediamine tetraacetic acid (EDTA) for 10 minutes. The
cells were pelleted by centrifugation (8000 x g) and rapidly suspended in 900 ml of 1 percent sucrose with 1 mM Pefabloc
SC hydrochloride (Boehringer Mannheim Corp., Indianapolis, IN) at 0°C. The suspension was pelleted at 8000 x g at
4°C. The protein was precipitated from the periplasmic extract by 70 percent saturated ammonium sulfate overnight at 4°C for 30
minutes. The precipitated protein was resuspended in 35 ml of 20 mM histidine, 1 percent sucrose at pH 6.6 (HSB).
Insoluble materials were removed at 1,000 x g at 4°C for 10 minutes. The clarified material was dialyzed versus HSB,
passed through a 0.2 µm filter and further purified on a 1 ml MonoQ HR 5/5 column (Pharmacia Biotech, Inc., Piscataway,
N.J.) equilibrated with HSB. The clarified material was loaded on the column at 1 ml/min, and the column was washed
with 10 column volumes of HSB. The column was then eluted with a gradient change of 5 mM NaCl per minute at a
flow rate of 1 ml/min. As detected by immunoblot with Xi126, SDS-PAGE and absorbance, PspA eluted as a single
peak at approximately 0.27 to 0.30 M NaCl. By SDS-PAGE the material was approximately 90 percent pure. The yield
from 6 liters of culture was 2 mg (Bio-Rad protein assay) of recombinant PspA.

Growth of pneumococci for challenge 


[0149] Mice were challenged with log-phase pneumococci grown in THY. For challenge, the pneumococci were
diluted directly into lactated Ringer's without prior washing or centrifugation. To inject the desired numbers of
pneumococci, their concentration in lactated Ringer's solution was adjusted to an O.D. of about 0.2 at 420 nm (LKB Ultrospec
III spectrophotometer). The number of pneumococci present was calculated at 5 x 10^8 CFU per ml/O.D. and confirmed
by colony counts (on blood agar) of serial dilutions of the inoculum.
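
The O.D.-based estimate of inoculum density described above can be sketched as follows. This is only an illustrative calculation, not part of the described procedure: the conversion factor of 5 x 10^8 CFU per ml per O.D. unit is taken from the text, while the function names, the target dose, and the 0.2 ml injection volume used in the example are hypothetical.

```python
# Conversion factor stated in the text: 5 x 10^8 CFU per ml per O.D. unit (420 nm).
CFU_PER_ML_PER_OD = 5e8


def estimated_cfu_per_ml(od_420: float) -> float:
    """Estimate the pneumococcal concentration from the O.D. at 420 nm."""
    return od_420 * CFU_PER_ML_PER_OD


def fold_dilution_for_dose(od_420: float, target_cfu: float,
                           injection_volume_ml: float) -> float:
    """Fold dilution of the suspension needed so that one injection
    volume contains target_cfu organisms (hypothetical helper)."""
    required_cfu_per_ml = target_cfu / injection_volume_ml
    return estimated_cfu_per_ml(od_420) / required_cfu_per_ml


if __name__ == "__main__":
    # A suspension at O.D. 0.2 is estimated at about 1 x 10^8 CFU/ml.
    print(estimated_cfu_per_ml(0.2))
    # Assumed example: 3 x 10^3 CFU delivered in an assumed 0.2 ml volume.
    print(fold_dilution_for_dose(0.2, 3e3, 0.2))
```

As in the text, such an estimate would still need to be confirmed by colony counts of serial dilutions of the inoculum.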

Immunization, challenge, and bleeding of mice

[0150] CBA/CAHN/XID/J (CBA/N) and BALB/cByJ (BALB/c) mice were purchased from Jackson Laboratory, Bar
Harbor, ME. Mice were given two injections two weeks apart and challenged i.v. two weeks later. Injections without
CFA were given intraperitoneally in 0.1 ml of Ringer's. Where indicated, the first injection was given in complete
Freund's adjuvant (CFA) consisting of approximately a 1:1 emulsion of antigen solution and CFA oil (Difco, Detroit, MI).
Antigen in CFA was injected inguinally in 0.2 ml divided between the two hind legs. All mice were boosted i.p. without
adjuvant. When mice were injected with media supernatants or 2 percent choline chloride eluates of whole bacteria,
the amounts of material injected were expressed as the volume of media from which the injected material was derived.
For example, if the clarified medium from pneumococci grown in CDM-CC or CDM-ET was used for immunization
without dilution or concentration, the dose was described as 100 µl. If the material was first diluted 1/10, or concentrated
10 fold, the dose was referred to as 10 or 1000 µl respectively.
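
The dose convention just described (dose expressed as the volume of original medium from which the injected material was derived) amounts to a single multiplication. A minimal sketch, in which the function name and parameters are hypothetical and the 100 µl injection volume is the one given in the text:

```python
def media_equivalent_dose_ul(injected_volume_ul: float,
                             concentration_factor: float) -> float:
    """Volume of original medium (in µl) represented by one injection.

    concentration_factor is 1.0 for undiluted medium, 0.1 for a 1/10
    dilution, and 10.0 for a 10-fold concentrate.
    """
    return injected_volume_ul * concentration_factor


if __name__ == "__main__":
    for factor in (1.0, 0.1, 10.0):
        print(factor, media_equivalent_dose_ul(100.0, factor))
```

With a 100 µl injection this reproduces the three cases stated in the text: 100 µl for undiluted medium, 10 µl for a 1/10 dilution, and 1000 µl for a 10-fold concentrate.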

ELISA for antibodies of PspA 

[0151] Specific modifications of previously reported ELISA conditions 17 are described. Microtitration plates (Nunc
Maxisorp, P.G.C. Scientific, Gaithersburg, MD) were coated with undiluted supernatants of Rx1 and WG44.1
pneumococci grown in CDM-ET or 1 percent BSA in PBS. Mice were bled retro-orbitally (75 µl) in a heparinized capillary tube
(Fisher Scientific, Fair Lawn, N.J.). The blood was immediately diluted in 0.5 ml of one percent bovine serum albumin
in PBS. The dilution of the resultant sera was 1/15 based on an average hematocrit of 47 percent. The sera were
diluted in 7 three-fold dilutions in microtitration wells starting at 1/45. MAb Xi126 was used as a positive control. The
maximum reproducible O.D. observed with Xi126 was defined as "maximum O.D." The O.D. observed in the absence
of immune sera or MAb was defined as "minimum O.D." Antibody titers were defined as the dilution that gives 33
percent of maximum O.D. The binding to the Rx1 CDM-ET coated plates was shown to be PspA-specific, since in no
case did we observe >33 percent of maximum binding of immune sera or Xi126 on plates coated with WG44.1 CDM-ET
or BSA.
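
The titer definition above (the dilution giving 33 percent of maximum O.D., read from 7 three-fold dilutions starting at 1/45) could be computed along the following lines. This is a sketch under stated assumptions: the text does not say how the endpoint between two wells was determined, so the log-linear interpolation used here is an assumption, and the function names and example O.D. readings are hypothetical.

```python
import math


def dilution_series(start: int = 45, fold: int = 3, steps: int = 7) -> list:
    """Reciprocal dilutions: 7 three-fold steps starting at 1/45 (from the text)."""
    return [start * fold ** i for i in range(steps)]


def reciprocal_titer(dilutions, ods, max_od, cutoff_fraction=0.33):
    """Reciprocal dilution at which the O.D. falls to cutoff_fraction of max_od.

    Interpolates on a log scale between the two wells that bracket the
    cutoff (an assumed method). Returns None when the readings never
    bracket the cutoff within the dilutions tested.
    """
    cutoff = cutoff_fraction * max_od
    wells = list(zip(dilutions, ods))
    for (d1, od1), (d2, od2) in zip(wells, wells[1:]):
        if od1 >= cutoff > od2:
            frac = (od1 - cutoff) / (od1 - od2)
            return math.exp(math.log(d1) + frac * (math.log(d2) - math.log(d1)))
    return None


if __name__ == "__main__":
    dils = dilution_series()
    example_ods = [1.0, 0.9, 0.66, 0.0, 0.0, 0.0, 0.0]  # hypothetical readings
    print(dils)
    print(reciprocal_titer(dils, example_ods, max_od=1.0))
```

Sera whose O.D. is already below the cutoff at 1/45 return None here, which corresponds to the "<" style of reporting used for non-immune sera in the text.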


Statistical analysis 

[0152] Unless otherwise indicated, P values refer to comparisons using the Wilcoxon two-sample rank test to compare
the numbers of days to death in different groups. Mice alive at 21 days were assigned a value of 22 for the sake of
calculation. P values of >0.05 have been regarded as not significant. Since we have never observed immunization
with PspA or other antigens to make pneumococci more susceptible to infection, the P values have been calculated as
single tailed tests. To determine what the P value would have been if a two tailed test had been used, the values given
should be multiplied by two. In some cases P values were given for comparisons of alive versus dead. These were
always calculated using the Fisher exact test. All statistical calculations were carried out on a Macintosh computer
using InStat (San Diego, CA).

PspA is the major protection-eliciting component released from pneumococci grown in
CDM-ET or CDM-CC, or released from conventionally grown pneumococci by elution with 2% CC

[0153] PspA-containing preparations from pneumococci were able to protect mice from fatal sepsis following i.v.
challenge with 3 x 10^3 (100 x LD50) capsular type 3 S. pneumoniae (Table 9). Comparable preparations from the strains
lacking PspA were unable to elicit protection. Regardless of the method of isolation the minimum protective dose was derived from
pneumococci grown in from 10-30 µl of medium. We also observed that supernatants of log phase pneumococci grown
in normal THY or CDM media could not elicit protection (data not shown). This finding is consistent with earlier
studies 36, 37 indicating that PspA is not normally released in quantity into the medium of growing pneumococci.

Isolated PspA can elicit protection against fatal infection 


[0154] Although PspA was necessary for these preparations to elicit protection, it was possible that it did not act
alone. Mice were thus immunized with purified FL-PspA to address this question.

Isolation of FL-PspA from CDM-ET growth medium 


[0155] We isolated the FL-PspA from CDM-ET rather than from CDM-CC medium or a 2 percent choline chloride
elution of live cells, because the high levels of choline present in the latter solutions prevent adsorption of the PspA
to the choline residues on the choline-Sepharose column. PspA for immunization was isolated from strain R36A, as
the strain is non-encapsulated and the isolated PspA could not be contaminated with capsular polysaccharide. As a
control we have conducted mock isolations from WG44.1 since this strain has an inactivated pspA gene and produces
no PspA. The results shown in Table 10 are typical of isolations from 300 ml of CDM-ET medium from R36A grown
pneumococci. We isolated 84 µg of PspA from 300 ml of medium, or about 280 µg/liter. Based on the dot blot results
this appears to be about 75% of the PspA in the original medium, indicating that CDM-ET from R36A cultures contains about
400 µg/liter of PspA, or about 0.4 µg/ml.
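
The yield figures in this paragraph follow from simple arithmetic, sketched below. The input values (84 µg recovered from 300 ml, about 75 percent recovery) are those given in the text; the function names are hypothetical.

```python
def recovered_ug_per_liter(recovered_ug: float, volume_ml: float) -> float:
    """Recovered PspA expressed per liter of starting medium."""
    return recovered_ug / (volume_ml / 1000.0)


def original_ug_per_liter(recovered_per_liter: float,
                          recovery_fraction: float) -> float:
    """Back-calculate the PspA concentration in the original medium."""
    return recovered_per_liter / recovery_fraction


if __name__ == "__main__":
    per_liter = recovered_ug_per_liter(84.0, 300.0)
    print(per_liter)                                 # recovered, µg per liter
    print(original_ug_per_liter(per_liter, 0.75))    # estimated original content
```

The back-calculated value is roughly 373 µg per liter, which the text rounds to about 400 µg per liter, or about 0.4 µg/ml.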

[0156] No serologically detectable PspA was seen in the CDM-ET from WG44.1 cultures. More significantly, no
detectable protein was recovered from the choline-Sepharose column after adsorption of CDM-ET from a WG44.1
culture, indicating that PspA is the only protein that could be isolated by this procedure. Moreover, by silver stained
SDS-PAGE gel the PspA isolated from R36A appeared to be homogeneous (Figure 3). Although autolysin can also be
isolated on choline-Sepharose 20, 50, we did not expect it to be isolated by this procedure since autolysin is not released
from pneumococci grown in choline deficient medium 36. The immunologic purity of the isolated PspA was emphasized
by the fact that immunization with it did not elicit any antibodies detectable on plates coated with CDM-ET supernatants
of WG44.1.

[0157] Loading more than 300 ml on the 0.6 ml bed volume column did not result in an increased yield, which
suggested that the column capacity had been reached. However, increasing the depth of the choline-Sepharose bed to
greater than 0.5 cm decreased the amount of PspA eluted from the column, presumably because of nonspecific
trapping of aggregates in the column matrix. The elution buffer contains 50 mM Tris acetate, 0.25 M NaCl and 2% choline
chloride. Elution without added NaCl or with 1 M NaCl resulted in lower yields. Elution with less than 1% CC also reduced
yields.

Immunization of mice with purified R36A PspA

[0158] For immunization we used only the first 3 ml fraction of the R36A column. Mice were immunized with two
injections of 1, 0.1, or 0.01 µg of R36A PspA, spaced two weeks apart. As controls, some mice were inoculated with
comparable dilutions of the first 3 ml fraction from the WG44.1 column. Purified FL-PspA elicited antibody to PspA at
all doses regardless of whether CFA was used as an adjuvant (Table 11). In the absence of CFA the highest levels of
antibody were seen with the 1 µg dose of PspA. In the presence of CFA, however, the 0.1 µg dose was as immunogenic
as the 1 µg dose.

[0159] To test the ability of the different doses of PspA to elicit protection against challenge we
infected the immunized mice with two capsular type 3 strains, WU2 and A66. Although both of these strains are able
to kill highly susceptible CBA/N XID mice at challenge doses of less than 10^2, the A66 strain is several logs more
virulent when BALB/c mice are used 47, 52. The difference in virulence of A66 and WU2 was partially compensated for
by challenging the immunized CBA/N mice with lower doses of strain A66 than WU2.

[0160] After immunization of CBA/N mice with 1 and 0.1 µg doses of PspA we observed protection against WU2
challenge regardless of whether or not CFA was used as an adjuvant (Table 4). At the lowest dose, 0.01 µg PspA,
most of the mice immunized with PspA + CFA lived whereas most immunized with PspA alone did not; however, the
difference was not statistically significant. When immunized mice were challenged with the more virulent strain
A66 47, 53, survivors were only observed among mice immunized with the 1 and 0.1 µg doses. There was slightly more
protection against fatal A66 infection among mice immunized with CFA than without, but the difference was not
statistically significant. When the two sample rank test was used to analyze the time to death of mice infected with A66 we
observed a statistically significant delay in the time to death in each immunized group as compared to the pooled
controls.
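
The time-to-death comparison used here follows the conventions stated under "Statistical analysis": a two-sample rank test, mice alive at 21 days scored as 22, and single-tailed P values. The sketch below illustrates those conventions with an exact permutation form of the rank-sum test. It is not the InStat procedure used in the study, and the days-to-death values in the example are hypothetical, not data from the tables.

```python
from itertools import combinations

SURVIVOR_VALUE = 22  # mice alive at 21 days are scored as 22, per the text


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def one_tailed_rank_test(immunized, controls):
    """Exact one-tailed P that immunized mice have longer times to death.

    Enumerates every way of assigning the pooled ranks to the immunized
    group, so it is only practical for small groups.
    """
    pooled = list(immunized) + list(controls)
    ranks = midranks(pooled)
    n = len(immunized)
    observed = sum(ranks[:n])
    total = 0
    as_extreme = 0
    for combo in combinations(range(len(pooled)), n):
        total += 1
        if sum(ranks[i] for i in combo) >= observed - 1e-9:
            as_extreme += 1
    return as_extreme / total


if __name__ == "__main__":
    # Hypothetical days to death; four immunized mice all survive to day 21.
    immunized = [SURVIVOR_VALUE] * 4
    controls = [2, 3, 3, 4]
    p_one_tailed = one_tailed_rank_test(immunized, controls)
    print(p_one_tailed, 2 * p_one_tailed)  # one-tailed, and doubled for two-tailed
```

Doubling the single-tailed value gives the two-tailed equivalent, as the text notes.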

The 29 kDa N-terminal fragment of PspA can elicit protection against infection when injected with CFA

[0161] We have compared the immunogenicity, with and without CFA, of an isolated 29 kDa fragment composed of
the first 260 amino acids of PspA. Unlike the case with FL-PspA, adjuvant was required for the 29 kDa fragment to elicit
a protective response. This was observed even though the immunizing doses of the 29 kDa antigen used were 10 and
30 µg/mouse, or about 100 and 300 times the minimum dose of FL-PspA that can elicit protection in the absence of
adjuvant.

Injection with CFA revealed the presence of additional protection eliciting antigen(s) in CDM-CC and CDM-ET growth
medium but not in the 2 percent choline chloride eluates of live cells


[0162] The observation that Freund's adjuvant could have such a major effect on the immunogenicity of the 29 kDa
fragment (Table 12) prompted us to reexamine the immunogens described in Table 2 to determine if immunization
with adjuvant might enhance protection elicited by PspA-containing preparations or provide evidence for protection
eliciting antigens in addition to PspA. By using CFA with the primary injection, the dose of PspA-containing growth
medium (CDM-CC and CDM-ET) required to elicit protection was reduced from 10-30 µl (Table 9) down to 1 to 3 µl (Table
13). When CFA was used as an adjuvant with CDM-CC and CDM-ET from PspA- strains WG44.1 and JY1119 we were
able to elicit protective immune responses if material from ≥100 µl or more of media was injected. Thus, although there
were apparently some protection eliciting components other than PspA in CDM-CC and CDM-ET growth media, PspA
remained the major protection eliciting component even in the presence of adjuvant.

[0163] One of the media used for injection was CDM-ET in which JY2141 had been grown. This medium elicited
protection against WU2 challenge even when injected at doses as low as 1 µl. It should be noted that although this
strain does not make full-length PspA, it secretes a truncated molecule comprising the first 115 amino acids of PspA
into the growth medium. Thus, unlike CDM-ET from WG44.1 and JY1119, CDM-ET from JY2141 with 2 percent CC
was relatively non-immunogenic even when emulsified with CFA. This result is consistent with the fact that the 115
amino acid N-terminal PspA fragment of JY2141 is not surface attached 37, and would be expected to be washed away
prior to the elution with 2 percent CC.

Extension of studies to BALB/c mice and the i.p. challenge route

[0164] The studies above all involve i.v. challenge of CBA/N mice expressing the XID genetic defect. The i.v.
route used in the present studies provides a relevant model for bacteremia and sepsis, but pneumococci have higher
LD50s when injected i.v. than i.p. CBA/N mice are hypersusceptible to pneumococcal infection because of the XID
defect. This genetic defect prevents them from having circulating naturally occurring antibody to phosphocholine. The
absence of these antibodies has been shown to make XID mice several logs more susceptible to pneumococci than
isogenic mice lacking the immune defect. From the data in Table 14 it is clear, however, that immunization with PspA
can protect against infection in mice lacking the XID defect even when the challenge is by the i.p. route. Thus, there
is no reason to suspect that the results presented are necessarily dependent on the use of the CBA/N XID mouse or
the i.v. route.

PspA is highly immunogenic

[0165] These studies provide the first quantitative data on the amount of purified FL-PspA that is required to elicit
protective immunity in mice. The isolated PspA for these studies was obtained by taking advantage of the fact that the
C-terminal half of PspA binds to cell surface choline 36. The isolated FL-PspA was found to be highly immunogenic in
the mouse. Only two injections of 100 ng of PspA in the absence of adjuvant were required to elicit protection against
otherwise fatal sepsis with greater than 100 LD50 of capsular type 3 S. pneumoniae. When the first injection was given
with adjuvant, doses as small as 10 ng could elicit a protective response. The potent immunogenicity of PspA, and the
ability to isolate it on choline-Sepharose columns, provide a demonstration for the possible use of PspA as a vaccine
in humans.

[0166] A large body of published 17 - ^ 37 as well as unpublished evidence indicates that the major protection eliciting
epitopes of PspA are located in the α-helical (N-terminal) half of the molecule. From the present studies, it is clear that
immunization with N-terminal fragments containing the first 115 or 260 of the 288 amino acid α-helical region are able
to elicit protection when given with CFA. However, these fragments were not able to elicit protective responses without
CFA. In the case of both the 115 and 260 amino acid fragments, even immunization at 100 times the minimum
dose that is immunogenic for FL-PspA failed to elicit a protective response. This result is consistent with previous
results showing that a fragment composed of the N-terminal 245 amino acids 31, 37 could elicit protection against
otherwise fatal pneumococcal infection of mice when the immunization was given with CFA 32. In that study no immunization
without CFA was attempted. Even though the C-terminal half of PspA may not contain major protection-eliciting epitopes
it appears to contain sequence important in the immunogenicity of the molecule as a whole, since the full length
molecule elicited much greater protection than the N-terminal fragments. The effect of the C-terminal half on antigenicity
may be in part that it doubles the size of the immunogen. Molecules containing the C-terminal half of PspA may also
be especially immunogenic because they exhibit more extensive aggregation than is seen with fragments expressing
only the α-helical region 38. Protein aggregates are known to generally be more antigenic and less tolerogenic than
individual free molecules 54.

PspA is the major protection eliciting component of our pneumococcal extracts 

[0167] Evidence that PspA is the major protection-eliciting component of the CDM-ET and CDM-CC growth media and
the two percent CC eluates depended on the use of mutant pneumococci that lacked the ability to produce
FL-PspA. More than one pspA mutant strain was used to ensure that the failure to elicit protection in the absence of
FL-PspA was not a spurious result of a non-PspA mutation blocking the production of some other antigen. Strains WG44.1
and JY1119 contain identical deletions that include the 5' end of the pspA gene and extend about 3 kb upstream of
pspA 37. WG44.1 is a mutant of the non-encapsulated strain Rx1, and JY1119 was made by transforming capsular type
3 strain WU2 with the WG44.1 pspA mutation. In no case were preparations from WG44.1 and JY1119 as efficient at
eliciting protection as those from the PspA+ strains. To rule out the possibility that protection elicited by preparations
from the PspA+ strains was elicited by some non-PspA molecule also encoded within the 3 kb deletion linked to the mutant
pspA genes of WG44.1 and JY1119, we also used strains JY2141 and LM34 26, 37. In these strains the Rx1 pspA gene
has been insertionally inactivated, causing the production of N-terminal fragments of 115 and 245 amino acids, respectively.
These strains have no other known mutations. Although Rx1 and R36A are closely related non-encapsulated
strains, some of the studies included Rx1 as the PspA+ control since it is the isogenic partner to WG44.1, LM34, and
JY2141. The N-terminal fragments produced by JY2141 and LM34 lack the surface anchor and are secreted into the
medium 36. Two percent CC eluates of JY2141 were non-protection-eliciting even in the presence of adjuvant. In the
absence of adjuvant, CDM-ET from JY2141 was not protection-eliciting. LM34 was tested without CFA in only 3 mice,
but gave results consistent with those obtained with JY2141.

[0168] Anticapsular antibodies are known to be protective against pneumococcal infection 5, 19. However, in these
studies it is unlikely they account for any of the protection we attributed to PspA. Our challenge strain bore the type 3
capsular polysaccharide and our primary source of PspA was strain R36A, which is a spontaneous non-encapsulated
mutant of a capsular type 2 strain 39-41. The R36A strain has recently been demonstrated to lack detectable type 3
capsule on the surface or in its cytoplasm 55. Furthermore, the CBA/N mice used in most of the studies are unable to
make antibody responses to type 3 polysaccharide.


Non-PspA protection components 

[0169] The observation that CDM-CC and CDM-ET supernatants of WG44.1 could elicit protection when injected in
large amounts with adjuvant suggested that these supernatants contained at least trace amounts of non-PspA protection-eliciting
molecules. In the case of preparations containing PspA eluted from the surface of live washed pneumococci
with 2 percent CC, there was no evidence for any protection-eliciting components other than PspA, presumably
because the protection-eliciting non-PspA proteins released into the media were removed by the previous washing
step. The identity of the protection-eliciting molecules in the WG44.1 supernatant is unknown. In this regard, it is of
interest that unlike R36A, strain Rx1 has been shown to contain a very small amount of cytoplasmic type 3 polysaccharide
(but totally lacks surface type 3 polysaccharide). This difference from R36A apparently came about through
genetic manipulations in the construction of Rx1 from R36A 39-41. Thus, preparations made from Rx1 or from its daughter
strains WG44.1, LM34, or JY2141 could potentially contain small amounts of capsular polysaccharide. For a number
of reasons, however, it seems very unlikely that the non-PspA protection-eliciting material identified in these studies
was type 3 capsular polysaccharide (expressed by the WU2 challenge strain): 1) growth of these strains was either in
CDM-CC or CDM-ET, each of which prevents autolysin activity and the lysis 57 that would be required to release the small
amount of type 3 polysaccharide from the cytoplasm of the Rx1 family strains; 2) CBA/N mice made protective responses
to the non-PspA antigens, but express the XID immune response deficiency, which permits responses to proteins
but blocks antibody to most polysaccharides 46, including type 3 capsular polysaccharide; and 3) immunogenicity
of the non-PspA component required CFA, an adjuvant known to stimulate T-dependent (protein) rather than T-independent
(polysaccharide) antibody responses.

[0170] A number of non-PspA protection-eliciting pneumococcal proteins have been identified: pneumolysin, autolysin,
neuraminidase, and PsaA, which are 52, 36.5, 107 and 37 kDa, respectively 21, 38, 59, 60. The non-PspA protection-eliciting
components reported here could be composed of a mixture of these and/or other non-identified proteins.
Attempts to identify lambda clones producing non-PspA protection-eliciting proteins as efficacious as PspA have not
been successful 25.

Isolation of PspA 

[0171] The protection-eliciting capacities of CDM-CC, CDM-ET and material eluted from live cells with 2% CC were similar in
terms of the volume of the original culture from which the injected dose was derived. The major advantage of eluting
the PspA from the surface of pneumococci with 2 percent CC is that the pneumococci may be grown in any standard
growth medium, and do not have to be first adapted to a defined medium. Moreover, concentration of PspA can be
accomplished by centrifugation of the pneumococci prior to the elution of the PspA. An advantage of using either
CDM-CC or CDM-ET medium was that these media prevented lysis, and pneumococci could be grown into stationary
phase without contaminating the preparations with cytoplasmic contents and membrane and wall components. A particular
advantage of CDM-ET growth medium is that, since it lacks high concentrations of choline, the PspA contained in
it can be adsorbed directly to a choline-Sepharose column for affinity purification.

[0172] One liter of CDM-ET growth medium contains about 400 µg of PspA, and we were able to isolate about 3/4
of it to very high purity. At 0.1 µg/dose, a liter of CDM-ET contains enough PspA to immunize about 4,000 mice or
possibly 40-400 humans. Our present batch size for a single column run is only 300 ml of CDM-ET. This could presumably
be increased by increasing the amount of the adsorbent surface by increasing the diameter of the column.
Using our present running buffer we have found that a choline-Sepharose resin depth of 0.5 cm was optimal; increases
beyond 0.5 cm caused the overall yield to decrease rather than increase, even in the presence of larger loadings of
R36A CDM-ET.
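The dose arithmetic in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function name and the choice of nanograms as the working unit are assumptions, and the human dose sizes (1 µg and 10 µg) are simply back-calculated from the stated range of 40-400 humans rather than given in the text.

```python
def doses_from_batch(total_ng, dose_ng, recovered_fraction=1.0):
    """Return the number of whole doses obtainable from a batch.

    All masses are in nanograms. recovered_fraction is the share of the
    PspA in the medium that is actually isolated (1.0 = all of it).
    """
    usable_ng = total_ng * recovered_fraction
    return int(usable_ng // dose_ng)


PSPA_PER_LITER_NG = 400_000  # about 400 µg of PspA per liter of CDM-ET (from the text)
MOUSE_DOSE_NG = 100          # 0.1 µg per dose (from the text)

# Doses contained in one liter of medium, before any purification losses.
mouse_doses = doses_from_batch(PSPA_PER_LITER_NG, MOUSE_DOSE_NG)

# Doses left if only about 3/4 of the PspA is recovered at high purity.
mouse_doses_recovered = doses_from_batch(PSPA_PER_LITER_NG, MOUSE_DOSE_NG, 0.75)

# Assumed human dose sizes that reproduce the 40-400 range quoted in the text.
human_doses_low = doses_from_batch(PSPA_PER_LITER_NG, 10_000)  # 10 µg per dose
human_doses_high = doses_from_batch(PSPA_PER_LITER_NG, 1_000)  # 1 µg per dose
```

Note that the figure of about 4,000 mice corresponds to the PspA contained in the medium; with the roughly 3/4 recovery reported for the column, the same arithmetic gives about 3,000 doses per liter.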

Table 9

PspA is the major protection-eliciting component in antigen preparations made by three different methods

Preparation          Strain               Dose as volume     Median Days   Alive:Dead    P versus
                     (PspA status)        of media in µl a   Alive                       controls b

2% CC eluate         R36A (PspA+)         1000               >21           2:0
from live cells                           200                >21           2:0
                                          [further R36A rows illegible in the source]
                     JY2141 (aa 1-115)    1000               3, >21        1:1
                                          200                1             0:2
                                          [further JY2141 rows illegible in the source]

CDM-CC clarified     Rx1 (PspA+)          [higher-dose rows illegible in the source]
medium                                    10                 2             1:2
                                          3                  2             0:3
                                          All                2, >21        12:6          0.0004
                     WG44.1 (PspA-)       100                2             0:9
                                          30                 2             0:3
                                          10                 [illegible]   [illegible]
                     WU2 (PspA+)          1000               >21           3:0           0.05
                                          100                [illegible]   [illegible]
                                          All                >21           4:0           [illegible]
                     JY1119 (PspA-)       1000               4             0:3
                                          100                [illegible]   [illegible]

CDM-ET clarified     R36A (PspA+)         100                >21           8:0           <0.0001
medium                                    10                 [illegible]   [illegible]   [illegible]
                                          1                  1.5           3:5
                                          0.1                2             0:2
                                          All                >21           16:12         0.006
                     JY2141 (aa 1-115)    100                1.5           0:2
                                          10                 1.5           0:5
                     WG44.1 (PspA-)       100                3             0:2
                                          10                 1.5           0:2

None                                                         2             0:14

a Antigen dose is given as the volume of growth media from which the 0.1 ml of injected material was derived. Each mouse was injected twice i.p.
with the indicated dose diluted as necessary in lactated Ringer's injection solution.

b Controls used for statistical comparisons: 2% CC, all JY2141; CDM-CC Rx1, all WG44.1; CDM-CC WU2, JY2141 + all JY2141.

Table 13

PspA is not the only protection eliciting molecule released from pneumococci by interference with binding to choline
on the surface of pneumococci

Preparation          Strain               Dose (as           Median Day    Alive:Dead    P values a
                     (PspA status)        volume in µl)      Alive

                                                                                         P vs. all JY2141
2% CC eluate         R36A (PspA+)         1000               >21           2:0           0.02
from live cells                           200                >21           5:0           0.02
                                          20                 >21           5:0           0.02
                                          2                  >21           5:0           0.001
                     all R36A                                >21           17:0
                     JY2141 (aa 1-115)    1000               >21           2:0
                                          200                1             0:2
                                          20                 1             0:2
                                          2                  1             0:2
                     all JY2141                              1             2:6

                                                                                         P versus pooled controls
CDM-CC clarified     Rx1 (PspA+)          1000               >21           3:0           0.002
medium + CFA                              100                >21           3:0           0.002
                     WU2 (PspA+)          1000               >21           3:0           0.002
                                          100                >21           3:0           0.002
                                          3                  >21           3:0           0.002
                     WG44.1 (PspA-)       1000               >21           5:1           <0.0001
                                          100                2.5           2:4           0.002
                     JY1119 (PspA-)       1000               >21           3:0           0.002
                                          100                >21           3:0           0.002

CDM-ET clarified     R36A (PspA+)         1000               >21           3:1           0.004
medium + CFA                              10                 >21           4:0           0.004
                                          1                  >21           3:1           0.004
                                          0.2                2             0:4
                     JY2141 (aa 1-115)    10                 >21           2:0
                                          1                  >21           2:0
                     all JY2141                              [illegible]   [illegible]   0.004
                     WG44.1 (PspA-)       100                >21           2:0
                                          10                 2             0:2

CDM-ET only + CFA                                            2             0:9
None                 none                                    1.5           0:4
Pooled Controls b                                            2             0:13

a In cases where there were no statistically significant results, no P value is shown.
b "Pooled Controls" refers to the "CDM-ET only" data and the "None" data.

EXAMPLE 4 - EVIDENCE FOR SIMULTANEOUS EXPRESSION OF TWO PspAs

[0173] From Southern blot analysis there has been an issue as to whether most isolates of S. pneumoniae have two
DNA sequences that hybridize with both 5' and 3' halves of Rx1 pspA, or whether this is an artifact of Southern blotting.
When bacterial lysates have been examined by Western blot, the results have always been consistent with the production
of a single PspA by each isolate. This Example provides evidence for the first time that two PspAs of different
apparent molecular weights and different serotypes can be simultaneously expressed by the same isolate.
[0174] Different PspAs frequently share cross-reactive epitopes, and an immune serum to one PspA was able to
recognize PspAs on all pneumococci. In spite of these similarities, PspAs of different strains can generally be distinguished
by their molecular weights and by their reactivity with a panel of PspA-specific monoclonal antibodies (MAbs).
[0175] A serotyping system for PspA has been developed which uses a panel of seven MAbs. PspA serotypes are
designated based on the pattern of positive or negative reactivity in immunoblots with this panel of MAbs. Among a
panel of 57 independent isolates of 9 capsular groups/types, 31 PspA serotypes were observed. The large diversity
of PspA was substantiated in a subsequent study of 51 capsular serotype 6B isolates from Alaska, provided by Alan
Parkinson at the Arctic Investigations Laboratory of the Centers for Disease Control and Prevention. Among these 51
capsular type 6B isolates, 22 different PspAs were observed based on PspA serotype and molecular weight variations
of PspA.

[0176] While most pneumococcal strains appear to have two DNA sequences homologous with both the 5' and 3'
halves of pspA, site-specific truncation mutations of Rx1 have revealed that one of these, pspA, encodes PspA. The other
sequence has been provisionally designated as the pspA-like sequence. At present, whether the pspA-like sequence
makes a gene product is unknown. Evidence that the pspA and pspA-like genes are homologous but distinct groups
of alleles comes from Southern blot analysis at high stringencies. Additional evidence that the pspA and pspA-like loci
are distinct comes from studies using PCR primers that permit amplification of a single product approximately 2 Kb in
size from 70% of pneumococci. For the remaining 30% of pneumococci no amplification was observed with the primers
used.

Evidence for two PspAs : 


[0177] When the strains MC25-28 were examined with the panel of seven MAbs specific for different PspA
epitopes, all four demonstrated the same patterns of reactivity (Fig. 4). The MAbs XiR278 and 2A4 detected a PspA
molecule with an apparent molecular weight of 190 KDa in each isolate. In accordance with the previous PspA serotyping
system, the 190 KDa molecule was designated as PspA type 6 because of its reactivity with XiR278 and 2A4,
but none of the five other MAbs in the typing system. Each isolate also produced a second PspA molecule with an
apparent molecular weight of 82 KDa. The 82 KDa PspA in each isolate was detected only with the MAb 7D2 and was
designated as type 34. No reactivity was detected with MAbs Xi126, Xi64, 1A4, or SR4W4. The fact that all four capsular
6B strains exhibit two PspAs, based on both molecular weights and PspA serotypes, suggested that they might be
members of the same clone.
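The serotype assignment described above amounts to looking up a pattern of positive and negative reactions across the seven-MAb panel. The following is a minimal sketch of that idea, not the typing scheme itself: only the two patterns named in this paragraph (type 6 and type 34) are entered, the function names are hypothetical, and the order of the MAbs in the panel tuple is arbitrary.

```python
# The seven MAbs named in the text; the ordering here is arbitrary.
PANEL = ("XiR278", "2A4", "7D2", "Xi126", "Xi64", "1A4", "SR4W4")

# Only the two reactivity patterns described in the text are listed.
# The full typing scheme defines more serotypes than are shown here.
KNOWN_SEROTYPES = {
    frozenset({"XiR278", "2A4"}): 6,
    frozenset({"7D2"}): 34,
}


def reactivity_pattern(reactive_mabs):
    """Return a tuple of booleans, one per panel MAb, in PANEL order."""
    unknown = set(reactive_mabs) - set(PANEL)
    if unknown:
        raise ValueError(f"MAbs not in the panel: {sorted(unknown)}")
    return tuple(mab in reactive_mabs for mab in PANEL)


def serotype(reactive_mabs):
    """Look up the serotype for a set of reactive MAbs, or None if not listed."""
    reactivity_pattern(reactive_mabs)  # validates the MAb names
    return KNOWN_SEROTYPES.get(frozenset(reactive_mabs))


# A seven-member panel with positive/negative calls allows 2**7 patterns.
possible_patterns = 2 ** len(PANEL)
```

With seven MAbs there are 128 possible reactivity patterns, which is consistent with the 31 serotypes reported among 57 isolates being a subset of what the panel can resolve.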


Simultaneous production of both PspAs :

[0178] Results from the colony immunoblotting showed that both PspAs were present simultaneously in each colony
of these isolates when grown in vitro. All colonies on each plate of the original culture, as well as all of the progeny
colonies from a single colony, reacted with MAbs XiR278, 2A4, and 7D2.

Number of pspA genes : 

[0179] One explanation for the second PspA molecule was that these strains contained an extra pspA gene. Since
most strains contain a pspA gene and a pspA-like gene, it was expected that if an extra gene were present one might
observe at least three pspA-homologous loci in isolates MC25-28. In Hind III digests of MC25-28 each strain revealed
a 7.7 and a 3.6 Kb band when probed with pLSMpspA13/2 (Figure 5A). In comparison, when Rx1 DNA was digested with
Hind III and hybridized with pLSMpspA13/2, homologous sequences were detected on 9.1 and 4.2 Kb fragments as
expected from previous studies (9) (Figure 5A). Results consistent with only two pspA-homologous genes in MC25-28
were also obtained with digestion using four additional enzymes (Table 15).

[0180] In previous studies it has been reported that probes for the 5' half of pspA (encoding the alpha-helical half of
the protein) bind the pspA-like sequence of most strains only at a stringency of around 90%. With chromosomal digests
of MC25-28 we observed that the 5' Rx1 probe pLSMpspA12/6 bound both pspA-homologous bands at a stringency
of greater than 95 percent. The same probe bound only the pspA-containing fragment of Rx1 at a stringency above 95
percent (Figure 5B).

[0181] Further characterization of the pspA gene was done by RFLP analysis of PCR-amplified pspA from each
strain. Since previous studies indicated that individual strains yielded only one product, and since the amplification is
carried out with primers based on a known pspA sequence, it seems likely that in each case the amplified products
represent the pspA rather than the pspA-like gene. When MC25-28 were subjected to this procedure, an amplified
pspA product of 2.1 Kb was produced in each case. When digested with Hha I, the sum of the fragments obtained
with each enzyme was approximately equal to the size of the 2.1 Kb amplified product (Figure 6). These results suggest
that the 2.1 Kb amplified DNA represents the amplified product of only a single DNA sequence. Rx1, by comparison,
produced an amplified product of 2.0 Kb and five fragments of 0.76, 0.468, 0.390, 0.349 and 0.120 Kb when digested with
Hha I, as expected from its known pspA sequence.
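The reasoning in this paragraph rests on a simple check: if the restriction fragments of a PCR product add up to roughly the size of the undigested product, the product is likely a single sequence, whereas a mixture of two co-migrating sequences would be expected to give fragments summing to well over that size. A small sketch of the check follows, using the Rx1 fragment sizes given in the text. The function name and the 10 percent tolerance are assumptions chosen for illustration, not values from the source.

```python
def fragments_match_product(fragment_sizes_kb, product_size_kb, relative_tolerance=0.1):
    """Compare the summed fragment sizes with the undigested product size.

    Returns (is_match, total_kb). The tolerance is relative to the product
    size and is an illustrative assumption, since gel-based size estimates
    are only approximate.
    """
    total_kb = sum(fragment_sizes_kb)
    is_match = abs(total_kb - product_size_kb) <= relative_tolerance * product_size_kb
    return is_match, total_kb


# Hha I fragments of the Rx1 amplified product, in Kb (from the text).
RX1_HHA1_FRAGMENTS_KB = [0.76, 0.468, 0.390, 0.349, 0.120]

# The Rx1 amplified product is reported as 2.0 Kb.
rx1_ok, rx1_total = fragments_match_product(RX1_HHA1_FRAGMENTS_KB, 2.0)
```

For Rx1 the fragments sum to 2.087 Kb against a 2.0 Kb product, which falls within the assumed tolerance.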

[0182] The four isolates examined in this Example are the first in which two PspAs have unambiguously been observed.
The interpretation that two PspAs are simultaneously expressed by a single pneumococcal isolate is based
on the observation that bands of different molecular weights were detected by different MAbs to PspA. The isolates used
in this study were from a group originally selected for study by Brian Spratt because of their resistance to penicillin. It
is very likely that all four of the isolates making two PspAs are related since they share PspA serotypes, amplified pspA
RFLPs, chromosomal pspA RFLPs, capsule type, and resistance to penicillin.

[0183] The interpretation of studies presented here, showing the existence of two PspAs in the four strains MC25-28,
must be set in the context of what is known about the serology of PspA as detected by Western blots. PspAs of different
strains have been shown previously to exhibit apparent molecular weights ranging from 60 to 200 KDa as detected
by Western blots. At least part of this difference in size is attributable to secondary structure. Even for the PspA of a
single isolate, bands of several sizes are generally observed. Mutation and immunochemistry studies have demonstrated,
however, that all of the different-sized PspA bands from Rx1 are made by a single gene capable of encoding a 69
KDa protein. The heterogeneity of band size on Western blots of PspA made by a single strain appears to be due to
both degradation and polymerization.

[0184] PspA was originally defined by reciprocal absorption studies demonstrating that a panel of MAbs to Rx1
surface proteins each reacted with the same protein, and later by studies using Rx1 and WU2 derivatives expressing various
truncated forms of PspA. In both cases it was clear that each MAb to the PspA of a given strain reacted with the same
protein. Such detailed studies have not been done with each of the several hundred human isolates. It is possible that
with some isolates, reactivity of the MAbs with two PspAs may have gone unnoticed. This could have happened if all
reactive antibodies detected both PspAs of the same isolate, or if the most prominent migration bands from each of
the two PspAs co-migrated. With isolates MC25-28 the observation of two PspAs was possible because clearly distinguishable
bands of different molecular weights reacted preferentially with different MAbs.

[0185] Applicants favor the interpretation that isolates MC25-28 each make two PspAs. An alternative possibility is
that the 190 KDa PspA detected by MAbs XiR278 and 2A4 is a dimer of the 84 KDa monomer
detected by MAb 7D2, with the epitopes recognized by the different MAbs dependent on either the dimeric or monomeric
status of the protein. This seems unlikely, since whenever MAbs react with the PspA of a strain, they usually detect
both the monomeric and the dimeric forms. No other isolates have been observed where some MAbs detected only
the apparent dimeric form of PspA while others detected only the monomeric form.

[0186] There could be several possible explanations for the failure to observe two PspAs produced by most strains.
1) All pneumococci might make two PspAs in culture, but MAbs generally recognize only one of them (perhaps in these
isolates there has been a recombination between pspA DNA and the pspA-like locus, thus allowing that locus to make
a product detected by MAb to PspA). 2) All pneumococci can have two pspAs but the expression of one of them
generally does not occur under in vitro growth conditions. 3) The pspA-like locus is normally a nonfunctional pseudogene
sequence that for an unexplained reason has become functional in these isolates.

[0187] It seems unlikely that the expression of only a single PspA by most strains is the result of a phase shift that
permits the expression of only the pspA or pspA-like gene at any one time, since many of the strains examined repeatedly
and consistently produce the same PspA. In the case of strains MC25-28, the appearance of two PspAs is apparently
not the result of a phase switch, since individual colonies produced both the type 6 and the type 34 PspAs.

[0188] Presumably in these four strains, the second PspA protein is produced by the pspA-like DNA sequence. At
high stringency, the probe comprising the coding region of the alpha-helical half of PspA recognized both pspA-homologous
sequences of MC25-28 but not the pspA-like sequence of Rx1. This finding indicates that the pspA-like sequence
of MC25-28 is more similar to the Rx1 pspA sequence than is the Rx1 pspA-like sequence. If the pspA-like sequence
of these strains is more similar to pspA than most pspA-like sequences, it could explain why we were able to see the
products of the pspA-like genes of these strains with our MAbs. The finding of two families of PspAs made in vivo by
pneumococci allows for use of the second PspA in compositions, as well as the use of DNA primers or probes for the
second gene for more conclusive detecting, determining or isolating of pneumococci.

Isolates and Bacterial Cell Culture : 


[0189] Pneumococcal isolates described in these studies were cultured from patients in Barcelona, Spain (one adult
at Bellvitge Hospital, and three children at San Juan de Dios) between 1986 and 1988 (Table 2). These penicillin-resistant
pneumococci, originally in the collection of Dr. Brian Spratt, were shared with applicants by Dr. Alexander
Tomasz at the Rockefeller Institute. Rx1 is a rough pneumococcus used in previous studies, and it is the first isolate
in which pspA was sequenced. Bacteria were grown in Todd-Hewitt broth with 0.5% yeast extract or on blood agar
plates overnight in a candle jar. Capsular serotype was confirmed by cell agglutination using Danish antisera (Statens
Seruminstitut, Copenhagen, Denmark) as previously described. The isolates were subsequently typed as 6B by
Quellung reaction, utilizing rabbit antisera against 6A or 6B capsule antigen prepared by Dr. Barry Gray.

Bacterial lysates :

[0190] Cell lysates were prepared by incubating the bacterial cell pellet with 0.1% sodium deoxycholate, 0.01%
sodium dodecylsulfate (SDS), and 0.15 M sodium citrate, and then diluting the lysate in 0.5 M Tris hydrochloride (pH
6.8) as previously described. Total pneumococcal protein in the lysates was quantitated by the bicinchoninic acid method
(BCA Protein Assay Reagent; Pierce Chemical Company, Rockford, IL).

PspA serotyping : 

[0191] Serotyping of PspA was performed according to previously published methods. Briefly, pneumococcal cell
lysates were subjected to SDS-PAGE, transferred to nitrocellulose membranes, and developed as Western blots using
a panel of seven MAbs to PspA. PspA serotypes were assigned based on the particular combination of MAbs with
which each PspA was reactive.

Colony Immunoblotting :

[0192] A ten ml tube of Todd-Hewitt broth with 0.5% yeast extract was inoculated with overnight growth of MC23
from a blood agar plate. The isolate was allowed to grow to a concentration of 10^7 cells/ml as determined by an O.D.
of 0.07 at 590 nm. MC23 was serially diluted and spread-plated on blood agar plates to give approximately 100 cells
per plate. The plates were allowed to grow overnight in a candle jar, and a single blood agar plate with well-defined
colonies was selected. Four nitrocellulose membranes were consecutively placed on the plate. Each membrane was
lightly weighted and left in place for 5 minutes. In order to investigate the possibility of phase-variation between the
two proteins detected on Western blots, a single colony was picked from the plate, resuspended in Ringer's, and spread-plated
onto a blood agar plate. The membranes were developed as Western blots according to PspA serotyping methods.
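The serial dilution needed to go from a culture at 10^7 cells/ml to roughly 100 cells per plate can be estimated as below. This is only a sketch: the plated volume of 0.1 ml and the use of tenfold dilution steps are assumptions for illustration, since the text gives neither.

```python
import math


def tenfold_steps_for_target(stock_cells_per_ml, target_cells_per_plate, plated_volume_ml):
    """Return (dilution_factor, number_of_tenfold_steps) for a target plate count.

    The required concentration is the target count divided by the plated
    volume; the dilution factor is the stock concentration divided by that.
    """
    required_cells_per_ml = target_cells_per_plate / plated_volume_ml
    dilution_factor = stock_cells_per_ml / required_cells_per_ml
    steps = round(math.log10(dilution_factor))
    return dilution_factor, steps


# 10^7 cells/ml and about 100 cells per plate are from the text;
# the 0.1 ml plated volume is an assumed, typical spread-plate volume.
factor, steps = tenfold_steps_for_target(1e7, 100, 0.1)
```

Under these assumptions the culture would need to be diluted about 10,000-fold, that is, four successive tenfold dilutions, before plating.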

Chromosomal DNA Preparation : 

[0193] Pneumococcal chromosomal DNA was prepared as in Example 9. The cells were harvested, washed, lysed,
and digested with 0.5% (wt/vol) SDS and 100 µg/ml proteinase K at 37°C for 1 hour. The cell wall debris, proteins, and
polysaccharides were complexed with 1% hexadecyl trimethyl ammonium bromide (CTAB) and 0.7 M sodium chloride
at 65°C for 20 minutes, then extracted with chloroform/isoamyl alcohol. DNA was precipitated with 0.6 volumes isopropanol,
washed, and resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0. DNA concentration was determined by
spectrophotometric analysis at 260 nm.


Probe preparation : 

[0194] 5' and 3' oligonucleotide primers homologous with nucleotides 1 to 26 and 1967 to 1990 of Rx1 pspA (LSM13
and LSM2, respectively) were used to amplify the full-length pspA and construct probe LSMpspA13/2 from Rx1
genomic DNA. 5' and 3' oligonucleotide primers homologous to nucleotides 161 to 187 and nucleotides 1093 to 1117
(LSM12 and LSM6, respectively) were used to amplify the variable alpha-helical region to construct probe
LSMpspA12/6. PCR-generated DNA was purified by Gene Clean (Bio101 Inc., Vista, CA) and random prime-labeled
with digoxigenin-11-dUTP using the Genius 1 Nonradioactive DNA Labeling and Detection Kit as described by the
manufacturer (Boehringer Mannheim, Indianapolis, IN).


DNA electrophoresis: 

[0195] For Southern blot analysis, approximately 10 µg of chromosomal DNA was digested to completion with a single
restriction endonuclease (Hind III, Kpn I, EcoR I, Dra I or Pst I), then electrophoresed on a 0.7% agarose gel for
16-18 hours at 35 volts. For PCR analysis, 5 µl of product was incubated with a single restriction endonuclease (Bcl
I, BamH I, Pst I, Sac I, EcoR I, Sma I, or Kpn I), then electrophoresed on a 1.3% agarose gel for 2-3 hours at 90
volts. In both cases, a 1 Kb DNA ladder was used for molecular weight markers (BRL, Gaithersburg, MD) and gels were
stained with ethidium bromide for 10 minutes and photographed with a ruler.

Southern blot hybridization

[0196] The DNA in the gel was depurinated in 0.25 N HCl for 10 minutes, denatured in 0.5 M NaOH and 1.5 M NaCl
for 30 minutes, and neutralized in 0.5 M Tris-HCl (pH 7.2), 1.5 M NaCl and 1 mM disodium EDTA for 30 minutes. DNA
was transferred to a nylon membrane (Micron Separations Inc., MA) using a POSIBLOT pressure blotter (Stratagene,
La Jolla, CA) for 45 minutes and fixed by UV irradiation. The membranes were prehybridized for 3 hours at 42°C in
50% formamide, 5X SSC, 5X Denhardt solution, 25 mM sodium phosphate (pH 6.5), 0.5% SDS, 3% (wt/vol) dextran
sulfate and 500 µg/ml of denatured salmon sperm DNA. The membranes were then hybridized in a solution
containing 45% formamide, 5X SSC, 1X Denhardt solution, 20 mM sodium
phosphate (pH 6.5), 0.5% SDS, 3% dextran sulfate, 250 µg/ml denatured sheared salmon sperm DNA and about 20 ng
of heat-denatured digoxigenin-labeled probe DNA. After hybridization, the membranes were washed twice in 0.1%
SDS and 2X SSC for 3 minutes at room temperature. The membranes were then washed twice to a final stringency of 0.1%
SDS in 0.3X SSC at 65°C for 15 minutes. This procedure yields a stringency greater than 95 percent. The membranes
were developed using the Genius 1 Nonradioactive DNA Labeling and Detection Kit as described by the manufacturer
(Boehringer Mannheim, Indianapolis, IN). To perform additional hybridization with other probes, the membranes were
stripped in 0.2 N NaOH/0.1% SDS at 40°C for 30 minutes and then washed twice in 2X SSC.

Polymerase Chain Reaction (PCR): 

[0197] 5' and 3' primers homologous with the DNA encoding the N- and C-terminal ends of PspA (LSM13 and LSM2,
respectively) were used in these experiments. Amplifications were made using Taq DNA polymerase, MgCl2 and 10X
reaction buffer obtained from Promega (Madison, WI). DNA used for PCR was prepared using the method previously
described in this paper. Reactions were conducted in 50 µl volumes containing 0.2 mM of each dNTP, and 1 µl of each
primer at a working concentration of 50 µM. MgCl2 was used at an optimal concentration of 1.75 mM with 0.25 units of
Taq DNA polymerase. Ten to thirty ng of genomic DNA was added to each reaction tube. The amplification reactions
were performed in a thermal cycler (M.J. Research, Inc.) using the following three-step program. Step 1 consisted of
a denaturing temperature of 94°C for 2 minutes. Step 2 consisted of 9 complete cycles of a denaturing temperature
of 94°C for 1 minute, an annealing temperature of 50°C for 2 minutes, and an extension temperature of 72°C for 3
minutes. Step 3 cycled 19 times with a denaturing temperature of 94°C for 1 minute, an annealing temperature of
60°C for 2 minutes, and an extension temperature of 72°C for 3 minutes. At the end of the last cycle, the samples were
held at 72°C for 5 minutes to ensure complete extension.

Band size estimation : 

[0198] Fragment sizes in the molecular weight standard and in the Southern blot hybridization patterns were calculated
from migration distances. The standard molecular sizes were fitted to a logarithmic regression model using Cricket
Graph (Cricket Software, Malvern, PA). The molecular weights of the detected bands were estimated by entering the
logarithmic line equation obtained by Cricket Graph into Microsoft Excel (Microsoft Corporation, Redmond, WA) in
order to calculate molecular weights based on migration distances observed in the Southern blot.
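The band size estimation described above can be sketched as a least-squares fit followed by interpolation. The exact form of the regression used in Cricket Graph is not stated in the text; the sketch assumes the common gel calibration in which log10 of fragment size is linear in migration distance. The function names are hypothetical, and the ladder values below are synthetic numbers generated for illustration, not measurements from the source.

```python
import math


def fit_log_size_vs_distance(distances_mm, sizes_kb):
    """Least-squares fit of log10(size) = intercept + slope * distance.

    Returns (slope, intercept).
    """
    n = len(distances_mm)
    log_sizes = [math.log10(s) for s in sizes_kb]
    mean_x = sum(distances_mm) / n
    mean_y = sum(log_sizes) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, log_sizes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_kb(distance_mm, slope, intercept):
    """Convert a migration distance to a fragment size using the fitted line."""
    return 10 ** (intercept + slope * distance_mm)


# Synthetic ladder (illustrative only): sizes follow log10(size) = 1.2 - 0.02 * d.
ladder_distances_mm = [10, 20, 30, 40, 50]
ladder_sizes_kb = [10 ** (1.2 - 0.02 * d) for d in ladder_distances_mm]

slope, intercept = fit_log_size_vs_distance(ladder_distances_mm, ladder_sizes_kb)
band_size_kb = estimate_size_kb(25, slope, intercept)
```

In practice the ladder distances would be read from the photographed gel, and the fit is reliable only over the range of sizes where the ladder behaves approximately log-linearly.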

EXAMPLE 5 - SOUTHERN BLOT ANALYSIS OF pspAs AND FRAGMENTS OF pspA



[0199] In this example, Applicants used oligonucleotides derived from the DNA sequence of pspA of S. pneumoniae 
Rx1 both as hybridization probes and as primers in the polymerase chain reaction to investigate the genetic variation 

s and conservation of the different regions of pspA and pspA-toke sequences. The probes used ranged in size from 1 7 
to 33 bases and included sequences representing the minus 35 , the leader, the a-hetical region, the proline-rich regions, 
the repeat regions, and the C-terminus. Applicants examined 1 8 different isolates representing 1 2 capsular and 9 PspA 
serotypes. The proline-rich, repeat, and leader, regions were highly conserved among pspA and pspA-Uke sequence. 
[0200] In the previous Example, it was shown that strain Rx1 and most other strains of S. pneumoniae had two
homologous sequences that could hybridize with probes encoding the N-terminal and C-terminal halves of PspA. The
conclusion that these were separate sequences was supported by the fact that, no matter which restriction enzyme
was used, at least two (generally two, sometimes three or four) restriction fragments of Rx1 and most
other strains hybridized with the pspA probes. When the genome of Rx1 was digested with HindIII and hybridized with
these probes, two pspA-homologous sequences were found, in 4.0 and 9.1 kb fragments. Using derivatives of Rx1 that
had insertion mutations in pspA, it was possible to determine that the 4.0 kb fragment contained the functional pspA
sequence. The pspA-homologous sequence included within the 9.1 kb band was referred to as the pspA-like sequence.
Whether or not the pspA-like sequence makes a product is not known, and none has been identified in vitro. Since
pspA-specific mutants can be difficult to produce in most strains, and exist for only a limited number of pneumococcal
isolates, this Example identifies oligonucleotide probes that could distinguish between the pspA and pspA-like
sequences.

[0201] The purpose of this Example was to further define both the conserved and variable regions of pspA, to
determine whether the central proline-rich region is variable or conserved, and to identify those domains of pspA that are
most highly conserved in the pspA-like sequence (and ergo, provide oligonucleotides that can distinguish between the
two). Oligonucleotides were used, and are therefore useful, as both hybridization probes and primers for polymerase
chain reaction (PCR) analysis.

Hybridization with oligonucleotide probes 

[0202] The oligonucleotides used in this study were based on the previously determined sequence of Rx1 PspA.
Their position and orientation relative to the structural domains of Rx1 PspA are shown in Figure 7. The reactivity of
these oligonucleotide probes with the pspA and pspA-like sequences was examined by hybridization with a HindIII
digest of Rx1 genomic DNA (Table 17). As expected, each of the eight probes recognized the pspA-containing 4.0 kb
fragment of the HindIII-digested Rx1 DNA. Five of the 8 probes (LSM1, 2, 3, 7, and 12) could also recognize the
pspA-like sequence of the 9.1 kb band, at least at low stringency. At high stringency four of the probes (LSM2, 3, 4 and 5)
were specific for the 4.0 kb fragment.

[0203] These 8 probes were used to screen HindIII digests of the DNA from 18 strains of S. pneumoniae at low and
high stringency. For comparison to earlier studies, each of the strains was also screened using a full-length pspA probe.
Table 23 illustrates the results obtained with each strain at high stringency. Table 18 summarizes the reactivities of the
probes with the strains at high and low stringency. Strain Rx1 is a laboratory derivative of the clinical isolate D39. The
results obtained with both strains were identical. They are listed under a single heading in Table 23 and are counted
as a single strain in Table 18. Although AC17 and AC94 are related clinical isolates, they have distinguishable pspAs
and are listed separately. All of the other strains represent independent isolates.

[0204] The only strain not giving at least two pspA-homologous HindIII fragments was WU2. This observation was
expected, since WU2 was previously shown to have only one pspA-homologous sequence and to give only a single
HindIII fragment that hybridizes with Rx1 pspA. Even at high stringency, 6 of the 8 probes detected more than one
fragment in at least one of the 18 strains (Tables 18 and 23). Probes LSM7, 10 and 12 reacted with DNA from a majority
of the strains and detected two fragments in over 59% of the strains they reacted with. In almost every case the
fragments detected by the oligonucleotide probes were identical in size to those detected by the full-length pspA probe.
Moreover, the same pairs of fragments were frequently detected by probes from the 3' as well as the 5' half of Rx1 pspA.
These results are consistent with earlier findings that the pairs of HindIII fragments from individual isolates generally
include two separate but homologous sequences, rather than fragments of a single pspA gene.
[0205] The differences in the frequency with which the oligonucleotides reacted with at least one fragment of the
strains in the panel were significant (P < 0.0001 by 2 x 8 chi-square). When the oligonucleotides were compared in
terms of their ability to react with both fragments of each strain, the P value was also < 0.0001. Table 18 gives the
percentage of strains reactive with each probe, the percentage in which only one fragment was reactive, and the
percentage in which two (or more) fragments were reactive.
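
As an illustration of a 2 x 8 chi-square comparison of this kind, the following sketch computes the Pearson statistic for a table of reactive versus non-reactive strains across eight probes. The counts are an assumption: they are back-calculated from the rounded high-stringency percentages in Table 18 (17 strains per probe), because the source does not list the raw counts or say which stringency the reported test used. The statistic therefore need not reproduce the reported P value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Assumed counts of reactive strains (out of 17) for LSM12, 5, 3, 4, 7, 1, 10, 2.
N_STRAINS = 17
reactive = [14, 5, 6, 5, 12, 11, 16, 9]
non_reactive = [N_STRAINS - r for r in reactive]

statistic, dof = chi_square_statistic([reactive, non_reactive])
print(f"chi-square = {statistic:.2f} with {dof} degrees of freedom")
```

The P value would then be read from the chi-square distribution with 7 degrees of freedom, for example from a statistics table or library.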

[0206] The last column in Table 18 gives the ratio of the number of strains that showed one reactive HindIII fragment
at high stringency to the total number of reactive strains. In this column, values of 1 were obtained with probes that
reacted with only one band in each reactive strain. Such probes are assumed to be those that are most specific for pspA.
The lowest values were obtained with probes that generally see two bands in each strain. Such probes are assumed to be
those that represent regions relatively conserved between the pspA and pspA-like sequences. At high stringency, probes
LSM3 and LSM4 detected only a single HindIII fragment in the DNA of strains they reacted with. These findings
suggested that probes LSM3 and LSM4 were generally detecting alleles of pspA rather than the pspA-like sequence. The
observation that the fragments detected by LSM3 or LSM4 were also detected by all of the other reactive probes
strengthened the conclusion that these probes generally detected the pspA rather than the pspA-like sequence. WU2
has only one pspA-homologous DNA sequence and secretes a serologically detectable PspA. The fact that LSM3
reacts with the single HindIII fragment of WU2 is consistent with the interpretation that LSM3 detects the pspA
sequences. Sequences representing the second proline region (LSM1) and the C-terminus (LSM2) also appeared to be
relatively specific for the pspA sequences, since they were generally detected in only one of the HindIII fragments of
each strain.
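
The ratio in the last column of Table 18 can be illustrated with a short calculation. The sketch below uses the high-stringency percentages from Table 18 for three of the probes. Because it divides rounded percentages rather than the underlying strain counts, which the source does not list, its output can differ from the tabulated ratio in the second decimal place. The names used are illustrative only.

```python
# High-stringency values from Table 18:
# probe -> (percent of strains with >= 1 band, percent with exactly 1 band)
HIGH_STRINGENCY = {
    "LSM12": (82, 24),
    "LSM3": (35, 35),
    "LSM4": (29, 29),
}


def single_band_ratio(percent_any, percent_single):
    """Fraction of reactive strains in which the probe detects one band."""
    return percent_single / percent_any


ratios = {probe: single_band_ratio(*values)
          for probe, values in HIGH_STRINGENCY.items()}

# A ratio of 1.0 marks a probe that never detected a second fragment.
for probe, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{probe}: {ratio:.2f}")
```

On this measure LSM3 and LSM4 score 1.0 (candidate pspA-specific probes), while LSM12 scores low, consistent with a region shared by the pspA and pspA-like sequences.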

[0207] Oligonucleotides LSM12 and LSM10 detected the most conserved epitopes of pspA and generally reacted
with both pspA-homologous fragments of each strain (Table 18). LSM7 was not quite as broadly cross-reactive but
detected two PspAs in 41% of strains, including almost 60% of the strains it reacted with. Thus, sequences representing
the leader, first proline region, and the repeat region appear to be relatively conserved not only within pspA but between
the pspA and pspA-like sequences. LSM3, 4, and 5 reacted with the DNA from the smallest fraction of strains of any
oligonucleotide (29-35 percent), suggesting that the portion of pspA encoding the α-helical region is the least
conserved region of pspA.

[0208] With two strains, BG58C and L81905, the oligonucleotides detected more than two HindIII fragments
containing pspA-homologous sequences. Because of the small size of the oligonucleotide probes and the absence of HindIII
restriction sites within any of them, it is very unlikely that these multiple fragments were the result of fragmentation of
the target DNA within the probed regions. In almost every case the extra fragments were detected at high
stringency by more than one oligonucleotide. These data strongly suggest that, at least in these two strains, there are 3 or
4 sequences homologous to at least portions of pspA. The probes most reactive with these additional sequences
are those for the leader, the α-helical region and the proline-rich region. The evidence for the existence of these
additional pspA-related sequences was strengthened by results with BG58C and L81905 at low stringency, where the LSM3
(α-helical) primer picked up the extra 1.2 kb band of L81905 (in addition to the 3.6 kb band) and the LSM7
(proline-rich) primer picked up the extra 3.2 and 1.4 kb bands (in addition to the 3.6 kb band) of BG58C.

Amplification of pspA 

[0209] The utility of these oligonucleotides as PCR primers was examined by determining if they could amplify
fragments of pspA from the genomic DNA of different pneumococcal isolates. Applicants attempted to amplify pspAs from
14 diverse strains of S. pneumoniae comprising 12 different capsular types using primers based on the Rx1 pspA
sequence. Applicants observed that the 3' primer LSM2, which is located at the 3' end of pspA, would amplify an
apparent pspA sequence from each of the 14 pneumococcal strains when used in combination with LSM1, located in
the region of pspA encoding the proline-rich region (Table 19). LSM2 was also used in combination with four other 5'
primers: LSM3, 7, 8 and 12. LSM8 is located 5' of the pspA start site (near the -35 region).

[0210] If a predominant sequence of the expected length was amplified that could be detected on a Southern blot
with a full-length pspA probe, we assumed that the pspA gene of the amplified DNA had sequences homologous
to those of the pspA primers used. Based on these criteria, the primer representing the α-helical sequence was found
to be less conserved than the primers representing the leader, proline, and C-terminal sequences. These results were
consistent with those observed for hybridization. The lowest frequency of amplification was observed with LSM8, which
is from the Rx1 sequence 5' of the pspA start site. This oligonucleotide was not used in the hybridization studies.

[0211] Further evidence for variability comes from differences in the sizes of the amplified pspA gene. The Example
showed that when PCR primers LSM12 and LSM2 were used to amplify the entire coding region of PspA, PCR products
from different pneumococcal isolates ranged in size from 1.9 to 2.3 kb (Table 20). The regions within pspA encoding
the α-helical, proline-rich, and repeat regions were also amplified from the same isolates. As seen in Table 20, the
variation in size of pspA appeared to come largely from variation in the size of the portion of pspA that encodes the
α-helical region.
[0212] Using probes that consisted of approximately the 5' and 3' halves of pspA, it has been determined that the
portion of pspA that encodes the α-helical region is less conserved than the portion of pspA that encodes the
C-terminal half of the molecule. This Example shows the same using 4 oligonucleotide probes from within each half of the
DNA encoding PspA. Since a larger number of smaller probes were used, Applicants have been able to obtain a higher
resolution picture of conserved and variable sequences within pspA and have also been able to identify regions of
likely differences and similarities between pspA and the pspA-like sequences.

[0213] The only strains in which the pspA gene has been identified by molecular mutations are Rx1, D39 and WU2.
Rx1 and D39 apparently have identical pspA molecules that are the result of the common laboratory origin of these
two strains. WU2 lacks the pspA-like gene. Thus, when most pneumococci are examined by Southern blotting using
full-length pspA as a probe, it is not possible to distinguish between the pspA and pspA-like loci, since both are readily
detected. A major aim of these studies was to attempt to identify conserved and variable regions within the pspA and
pspA-like loci. A related aim was to determine whether probes based on the Rx1 pspA could be identified that would
permit one to differentiate pspA from the pspA-like sequence. Ideally such probes would be based on a relatively
conserved portion of the pspA sequence that was quite different in the pspA-like sequence. A useful pspA-specific probe
would be expected to identify the known Rx1 and WU2 pspA genes and identify only a single HindIII fragment in most
other strains. Two probes (LSM3 and LSM4) never reacted with more than one pspA-homologous sequence in any
particular strain. Both reacted with Rx1 pspA, and LSM3 reacted with WU2 pspA. Each of these probes reacted with
4 of the other 15 strains. When these probes identified a band, however, the band was generally also detected by all
other Rx1 probes reactive with that strain's DNA. Additional evidence that LSM3 and LSM4 were restricted to
reactivity with pspA was that they reacted with the same bands in all three non-Rx1 strains. Each probe identifies pspA
in certain strains, and even when used in combination they recognized pspA in over 40 percent of strains. Probes for
the second proline-rich region (LSM1) and the C-terminus of pspA (LSM2) generally, but not always, identified only
one pspA-homologous sequence at high stringency. Collectively LSM1, 2, 3, and 4 reacted with 16 of the 17 isolates
and in each case revealed a consensus band recognized by most to all of the reactive probes.
[0214] By making the assumption that in different strains the Rx1 pspA probes are more likely to recognize pspA
than the pspA-like sequences, it is possible to make some predictions about areas of conservation and variability within
the pspA and pspA-like sequences. When a probe detected only a single pspA-homologous sequence in an isolate,
it was assumed that it was pspA. If the probe detected two pspA-homologous sequences, it was assumed that it was
reacting with both the pspA and pspA-like sequences. Thus, the approximate frequency with which a probe detects
pspA can be read from Table 18 as the percent of strains where it detects at least one pspA-homologous band. The
approximate frequency with which the probes detect the pspA-like sequence is the percent of strains in which two or
more pspA-homologous bands are detected.
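
The interpretive rule in this paragraph can be written out explicitly. The sketch below encodes the stated assumptions (one band is taken to be pspA; two or more bands are taken to be pspA plus the pspA-like sequence) and derives the two approximate detection frequencies from per-strain band counts. The band counts shown are hypothetical, and the function names are not from the original.

```python
def interpret_band_count(n_bands):
    """Loci assumed to be detected by a probe, given its band count in a strain."""
    if n_bands <= 0:
        return set()
    if n_bands == 1:
        return {"pspA"}
    return {"pspA", "pspA-like"}


def detection_frequencies(band_counts):
    """Percent of strains in which pspA, and the pspA-like sequence, are detected."""
    n_strains = len(band_counts)
    pspa_hits = sum(1 for n in band_counts if n >= 1)
    pspa_like_hits = sum(1 for n in band_counts if n >= 2)
    return 100.0 * pspa_hits / n_strains, 100.0 * pspa_like_hits / n_strains


# Hypothetical band counts for one probe across five strains.
example_counts = [2, 1, 0, 2, 1]
pspa_percent, pspa_like_percent = detection_frequencies(example_counts)
print(f"pspA: {pspa_percent:.0f}%  pspA-like: {pspa_like_percent:.0f}%")
```

The estimate is only as good as the underlying assumption; a strain with a single band could in principle be reacting through its pspA-like sequence alone.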

[0215] Using these assumptions, the most variable portion of the pspA gene was observed to be the -35
region and the portion encoding the α-helical region. The most conserved portion of pspA was found to be the repeat
region, the leader and the proline-rich region. Although only one probe from the repeat region was used, the high degree of
conservation among the 10 repeats in the Rx1 sequence makes it likely that other probes for the repeat regions would give
similar results.

[0216] The portion of the pspA-like sequence most similar to Rx1 pspA was that encoding the leader sequence, the
5' portion of the proline-rich region, and the repeat region, and those portions encoding the N-terminal end of the
proline-rich and repeat regions. The repeat region of PspA has been shown to be involved in the attachment of PspA to the
pneumococcal surface. The conservation of the repeat region among both pspA and pspA-like genes suggests that if
a PspA-like protein is produced, it may have a surface attachment mechanism similar to that of PspA. The need
for a functional attachment site may explain the conservation of the repeat region. Moreover, the conservation in DNA
encoding the repeat regions of the pspA and pspA-like genes suggests that the repeat regions may serve as a potential
anti-pneumococcal drug target. The conservation in the leader sequence between pspA and the pspA-like sequence
was also not surprising, since similar conservation has been reported for the leader sequence of other gram-positive
proteins, such as M protein of group A streptococci. It is noteworthy, however, that there is little evidence at the DNA
level that the PspA leader is shared by many genes other than PspA and the possible gene product of the pspA-like locus.
[0217] Although the region encoding the C-terminus of pspA (LSM2) or the 3' portion of the proline-rich sequence
(LSM1) appear to be highly conserved within pspA genes, corresponding regions in the pspA-like sequences are either
lacking or very distinct from those in pspA. The reason for conservation at these sites is not apparent. In the case of
PspA, its C-terminus does not appear to be necessary for attachment, since mutants lacking the C-terminal 49
amino acids are apparently as tightly attached to the cell surface as those with the complete sequence. Whether these
differences from pspA portend a subtle difference in the mechanism of attachment of proteins produced by these two
sequences is unknown. If the C-terminal end of the pspA-like sequence, or the 3' portion of the proline-rich sequence
in the pspA-like sequence, is as conserved within the pspA-like family of genes as it is within pspA, then these regions
of pspA and the pspA-like sequence may serve as targets for the development of probes to distinguish between all pspA
and pspA-like genes.

[0218] With two strains, some of the oligonucleotide probes identified more than two pspA-homologous sequences.
In the case of each of these strains, there was a predominant sequence recognized by almost all of the probes, and
two or three additional sequences that were each recognized by at least two of the probes. One interpretation of the
data is that there may be more than two pspA-homologous genes in some strains. The significance of such sequences
is far from established. It is of interest, however, that although the additional sequences share areas of homology
with the leader, α-helical, and proline
region, they exhibited no homology with the repeat region of the C-terminus of pspA. These sequences, thus, might
serve as elements that can recombine with pspA and/or the pspA-like sequences to generate sequence diversity.
Alternatively, the sequences might produce molecules with very different C-terminal regions, and might not be surface
attached. If these pspA-like sequences make products, however, they, like PspA, may be valuable as a component of
pneumococcal antigenic, immunological or vaccine compositions.

Bacterial strains, growth conditions and isolation of chromosomal DNA

[0219] S. pneumoniae strains used in this study are listed in Table 5. Strains were grown in 100 ml of Todd-Hewitt
broth with 0.5% yeast extract at 37°C to an approximate density of 5 x 10^8 cells/ml. Following harvesting of the cells by
centrifugation (2800 x g, 10 minutes), the DNA was isolated as previously described and stored at 4°C in TE (10 mM
Tris, 1 mM EDTA, pH 8.0).

Amplification of pspA sequences 

[0220] Polymerase chain reaction (PCR) primers, which were also used as oligonucleotide probes in Southern
hybridizations, were designed based on the sequence of pspA from pneumococcal strain Rx1. These oligonucleotides
were obtained from Oligos Etc. (Wilsonville, OR) and are listed in Table 22.

[0221] PCRs were done with an MJ Research, Inc., Programmable Thermal Cycler (Watertown, MA) as previously
described, using approximately 10 ng of genomic pneumococcal DNA with the appropriate 5' and 3' primer pair. The sample
was brought to a total volume of 50 µl containing a final concentration of 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM
MgCl2, 0.001% gelatin, 0.5 mM each primer, 200 mM of each deoxynucleotide triphosphate, and 2.5 U of Taq DNA
polymerase. Following overlaying of the samples with 50 µl of mineral oil, the samples were denatured at 94°C for 2
minutes. Then the samples were subjected to 10 cycles consisting of 1 minute at 94°C, 2 minutes at 50°C, and 3
minutes at 72°C, followed by another 20 cycles of 1 minute at 94°C, 2 minutes at 60°C, and 3 minutes at 72°C. After
all 30 cycles, the samples were held at 72°C for an additional 5 minutes prior to cooling to 4°C. The PCR products
were analyzed by agarose gel electrophoresis.
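
For reference, the cycling program can be written as a small data structure. The sketch below reads the program as 10 cycles with a 50°C annealing step followed by 20 cycles with a 60°C annealing step, which is the reading consistent with the stated total of 30 cycles. The layout and names are illustrative and do not represent the input format of any particular thermal cycler. The total counts programmed hold times only; ramp times and the final 4°C hold are not included.

```python
# Each stage: (description, number of repeats, [(temperature in C, minutes), ...])
PROGRAM = [
    ("initial denaturation", 1, [(94, 2)]),
    ("cycles with 50 C annealing", 10, [(94, 1), (50, 2), (72, 3)]),
    ("cycles with 60 C annealing", 20, [(94, 1), (60, 2), (72, 3)]),
    ("final extension", 1, [(72, 5)]),
]


def amplification_cycles(program):
    """Number of three-step denature/anneal/extend cycles in the program."""
    return sum(repeats for _, repeats, steps in program if len(steps) == 3)


def total_minutes(program):
    """Sum of programmed hold times, ignoring temperature ramping."""
    return sum(repeats * sum(minutes for _, minutes in steps)
               for _, repeats, steps in program)


print(amplification_cycles(PROGRAM), "cycles,", total_minutes(PROGRAM), "minutes")
```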

DNA hybridization analysis 

[0222] Approximately 5 µg of chromosomal DNA was digested with HindIII according to the manufacturer's
instructions (Promega, Inc., Madison, WI). The digested DNA was electrophoresed at 35 mV overnight in a 0.8% agarose
gel and then vacuum-blotted onto Nytran membranes (Schleicher & Schuell, Keene, NH).

[0223] Labeling of oligonucleotides and detection of probe-target hybrids were both performed with the Genius
System according to the manufacturer's instructions (Boehringer Mannheim, Indianapolis, IN). All hybridizations were
done for 18 hours at 42°C without formamide. By assuming that 1% base-pair mismatching results in a 1°C decrease in Tm,
designations of "high" and "low" stringency were defined by the salt concentration and temperature of the
post-hybridization washes. Homology between probe and target sequences was derived from the calculated Tm by the
established method. High stringency is defined as 90% or greater homology, and low stringency is 80-85% sequence homology.
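
The relationship between wash conditions and homology can be made concrete with a short calculation. The sketch below applies the stated rule of a 1°C decrease in Tm per 1% mismatch. It takes the Tm of the perfectly matched hybrid under the wash salt conditions as an input, since the source does not give the Tm formula used; the example temperatures are hypothetical.

```python
def minimum_homology_percent(tm_perfect_c, wash_temp_c, degrees_per_percent=1.0):
    """Lowest percent homology expected to survive a wash at wash_temp_c.

    Assumes each 1% of base-pair mismatch lowers the hybrid Tm by
    degrees_per_percent degrees C (1.0 in the text).
    """
    tolerated_mismatch = max(0.0, (tm_perfect_c - wash_temp_c) / degrees_per_percent)
    return max(0.0, 100.0 - tolerated_mismatch)


def stringency_label(homology_percent):
    """Map a homology threshold onto the definitions given in the text."""
    if homology_percent >= 90.0:
        return "high"
    if 80.0 <= homology_percent <= 85.0:
        return "low"
    return "unclassified"


# Hypothetical probe with a perfect-match Tm of 60 C under the wash conditions.
for wash_temp_c in (52.0, 43.0):
    homology = minimum_homology_percent(60.0, wash_temp_c)
    print(f"wash at {wash_temp_c} C: >= {homology:.0f}% homology "
          f"({stringency_label(homology)} stringency)")
```

Because Tm itself depends on salt concentration, changing the wash buffer shifts the perfect-match Tm and therefore the homology threshold at a fixed wash temperature.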

Table 17.

Hybridization of oligonucleotides with HindIII restriction fragments of Rx1 DNA.

                                  Stringency
Oligonucleotide    Region         Low          High
LSM12              Leader         N.D.         4.0, 9.1
LSM5               α-helix        N.D.         4.0
LSM3               α-helix        4.0, 9.1     4.0
LSM4               α-helix        4.0          4.0
LSM7               Proline        4.0, 9.1     4.0, 9.1
LSM1               Proline        4.0, 9.1     4.0, 9.1
LSM10              Repeats        N.D.         4.0, 9.1
LSM2               C-terminus     4.0, 9.1     4.0

Note. Values indicated are the sizes of restriction fragments expressed as kb.

Table 18.

Summary of Hybridization at High and Low Stringency of 8 Oligonucleotides with HindIII Restriction Fragments of
the 17 Pneumococcal Isolates Listed in Figure 2

                   Percent          Percent          Percent          1 band/
                   with ≥1 band     with ≥2 bands    with 1 band      ≥1 band
Oligonucleotide    Low     High     Low     High     Low     High     Low     High
LSM12                      82               59               24               0.29
LSM5                       29               18               12               0.40
LSM3               65      35       41      0        24      35       0.36    1.00
LSM4               35      29       0       0        35      29       1.00    1.00
LSM7               94      71       71      41       24      29       0.25    0.42
LSM1               100     65       53      12       47      53       0.47    0.82
LSM10                      94               59               35               0.37
LSM2               88      53       41      12       47      41       0.63    0.78

Note. For all values listed, all 17 strains were examined.
If no value is listed, then no strains were examined.



Table 19.

Amplification of Pneumococcal Isolates using the Indicated 5' Primer in Combination with the 3' Primer LSM2 at the 3'
end of pspA

5' Primer    Region        Nucleotide Position    Amplified/Tested    Percent Amplified
LSM8         -35           47 to 70               2/14                14
LSM12        leader        162 to 188             8/14                57
LSM3         α-helical     576 to 598             3/14                21
LSM7         proline       1093 to 1117           12/14               86
LSM1         proline       1312 to 1331           14/14               100

(P < 0.0001). The tendency for there to be more amplification with the 3'-most primers was significant at P < 0.0001.



Table 20.

Size of amplified pspA fragments in kilobases

pspA Region    Primer Pairs     Number of pspAs examined    Size          Range    S.D.
Full length    LSM12 + LSM2     9                           1.9-2.3       0.4      0.17
α-helical      LSM12 + LSM6     6                           1.1-1.5       0.4      0.17
Proline        LSM7 + LSM9      3                           0.23          0        0
Repeats        LSM1 + LSM2      19                          0.6-0.65      0.05     0.01

Note: amplification was attempted with each set of primers on a panel of 19 different pspAs. Data is shown only
for pspAs that could be amplified with the indicated primer pairs.



Table 21.

Pneumococcal Strains

Strain       Relevant characteristics
WU2          Capsular type 3, PspA type 1
D39          Capsular type 2, PspA type 25
R36A         Nonencapsulated mutant of D39, PspA type 25
Rx1          Nonencapsulated variant of R36A, PspA type 25
DBL5         Capsular type 5, PspA type 33
DBL6A        Capsular type 6A, PspA type 19
A66          Capsular type 3, PspA type 13
AC94         Capsular type 9L, PspA type 0
AC17         Capsular type 9L, PspA type 0
AC40         Capsular type 9L, PspA type 0
AC107        Capsular type 9V, PspA type 0
AC100        Capsular type 9V, PspA type 0
AC 4 Aft     Capsular type 9N, PspA type 18
D109-1B      Capsular type 23, PspA type 12
BG9709       Capsular type 9, PspA type 0
BG58C        Capsular type 6A, PspA type ND
L81905       Capsular type 4, PspA type 25
L82233       Capsular type 14, PspA type 0
L82006       Capsular type 1, PspA type 0



Table 22 PCR Primers

Primer    Sequence (5' to 3')
LSM1      CCGGATCCAGCTCCTGCACCAAAAAC
LSM2      GCGCGTCGACGGCTTAAACCCATTCACCATTGG
LSM3      CCGGATCXn^GCCAGAGCAGTTGGCTG
LSM4      CCGGATCCGCTCAAAGAGATTGATGAGTCTG
LSM5      GCGGATCCCGTAGCCAGTCAGTCTAAAGCTG
LSM6      CTGAGTCGACTGGAGTTTCTGGAGCTGGAGC
LSM7      CCGGATCCAGCTCCAGCTCCAGAAACTCCAG
LSM8      GCGGATCCTTGACCAATATTTACGGAGGAGGC
LSM9      GTTTTTGGTGCAGGAGCTGG
LSM10     GCTATGGCTACAGGTTG
LSM11     CCACCTGTAGCCATAGC
LSM12     CCGGATCCAGCGTGCCTATCTTAGGGGCTGGTT
LSM13     GCAAGCTTATGATATAGAAATTTGTAAC

EXAMPLE 6 - RESTRICTION FRAGMENT LENGTH POLYMORPHISMS OF pspA REVEALS GROUPING



[0224] Pneumococcal surface protein A (PspA) is a protection-eliciting protein of Streptococcus pneumoniae. The deduced
amino acid sequence of PspA predicts three distinct domains: an α-helical coiled-coil region, followed by two adjacent
proline-rich regions, and ten 20-amino-acid repeats. Almost all PspA molecules are cross-reactive with each other to
variable degrees. However, using a panel of monoclonal antibodies specific for individual epitopes, this protein has
been shown to exhibit considerable variability even within strains of the same capsular type. Oligonucleotide primers
based on the sequence of pspA from S. pneumoniae Rx1 were used to amplify the full-length pspA gene and the 5'
portion of the gene including the α-helical and the proline-rich regions. PCR-amplified products were digested with Hha I
or Sau3A I to visualize restriction fragment length polymorphism of pspA. Although strains were collected from around
the world and represented 21 different capsular types, isolates could be grouped into 17 families or subfamilies based
on their RFLP pattern. The validity of this approach was confirmed by demonstrating that the pspAs of individual strains
which are known to be clonally related were always found within a single pspA family.

[0225] Numerous techniques have been employed in epidemiological surveillance of pneumococci, including
serotyping, ribotyping, pulsed field electrophoresis, multilocus enzyme electrophoresis, penicillin-binding protein
patterns, and DNA fingerprinting. Previous studies have also utilized the variability of pneumococcal surface protein A
(PspA) to differentiate pneumococci. This protein, which can elicit protective antipneumococcal antibodies, is a
virulence factor found on all pneumococcal isolates. Although PspA molecules are commonly cross-reactive, they are
seldom antigenically identical. This surface protein is the most serologically diverse protein known on pneumococci;
therefore, it is an excellent marker to be used to follow individual strains. Variations in PspA and the DNA surrounding
its structural gene have proven useful for differentiation of S. pneumoniae.

[0226] When polyclonal sera are used to identify PspA, cross-reaction is observed between virtually all isolates.
Conversely, when panels of monoclonal antibodies are used to compare PspAs of independent isolates, they are almost
always observed to express different combinations of PspA epitopes. A typing system based on this approach has
limitations because it does not easily account for differences in monoclonal binding strength to different PspA molecules.
Moreover, some strains are weakly reactive with individual monoclonal antibodies and may not always give consistent
results.

[0227] A less ambiguous typing system that takes advantage of the diversity of PspA therefore needed to be
developed, and was used to examine the clonality of strains. This method involves examination of the DNA within and
adjacent to the pspA locus. Southern hybridizations of pneumococcal chromosomal DNA digested with various
endonucleases, such as Hind III, Dra I, or Kpn I, and probed with labeled pspA provided a means to study the variability of
the chromosome surrounding pspA. When genomic DNA is probed, the pspA and the pspA-like loci are revealed. In
most digests the pspA probe hybridizes to 2-3 fragments, and digests of independent isolates were generally dissimilar.
[0228] Like the monoclonal typing system, the Southern hybridization procedure permitted the detection of clones
of pneumococci. However, it did not provide a molecular approach for following pspA diversity. Many of the restriction
sites defining the restriction fragment length polymorphism (RFLP) were outside of the pspA gene, and it was difficult
to differentiate the pspA gene from the pspA-like locus. In an effort to develop a system to follow pspA diversity,
Applicants examined the RFLP of PCR-amplified pspA. Amplified pspA was digested with Sau3A I and Hha I, restriction
enzymes with four-base recognition sites. To evaluate the utility of this approach, pspAs from clinical and laboratory
strains known to be clonally related, as well as random isolates, were examined.

Bacterial strains 

[0229] Derivatives of the S. pneumoniae D39-Rx1 family were kindly provided by Rob Massure and Sanford Lacks
(Figure 8). Eight clinical isolates from Spain and four isolates from Hungary were a gift from Alexander Tomasz.
Seventy-five random clinical isolates from Alabama, Sweden, Alaska, and Canada were also studied.

PCR amplifications 

[0230] The oligonucleotide primers used in this study are listed in Table 24. Chromosomal DNA, which was isolated
according to procedures described by Dillard et al., was used as template for the PCR reactions. Amplification was
accomplished in a 50 µl reaction containing approximately 50 ng template DNA, .25 U Taq, 50 µM of each primer, 175
µM MgCl2, and 200 µM of each deoxynucleotide triphosphate in a reaction buffer containing 10 µM Tris-HCl, pH 9.0,
50 nM KCl, 0.1% Triton X-100, and 0.01% wt/vol. gelatin. The mixture was overlaid with mineral oil and placed in a DNA
thermal cycler. The amplification program consisted of an initial denaturation step at 94°C, followed by 29 cycles of
94°C for 1 min, 55°C for 2 min, and 72°C for 3 min. The final cycle included an incubation at 72°C for 5 min.

Restriction fragment analysis of PCR-amplified product

[0231] Aliquots of the PCR mixtures were digested with Hha I or Sau3A I in a final volume of 20 µl according to
manufacturer's protocols. After digestion the DNA fragments were electrophoresed on a 1.3% TBE agarose gel and
stained with ethidium bromide. Fragment sizes were estimated by comparison to a 1 kb DNA ladder (Gibco BRL).
[0232] Because of the variability of pspA, and the fact that the entire pspA sequence is known for only one gene, it
has not been possible to design primers which amplify pspA from 100% of pneumococcal strains. However,
oligonucleotide primers LSM2 and LSM1 can amplify an 800 bp region of the C-terminal end in 72 of the 72 strains tested.
Based on hybridizations at different stringencies, this region was found to be relatively conserved in pneumococcal
strains, and thus would not be expected to be optimal for following restriction polymorphisms within the pspA molecule.
LSM13 and LSM2, primers which amplify the full-length pspA gene, can amplify pspA from approximately 79% (55/75)
of the strains tested (Table 25).

Stability of amplified RFLP pattern within clonally related pneumococci 


[0233] To determine the stability of pspA during long passages in vitro, we examined the RFLP pattern of the pspA
gene of the derivatives of the S. pneumoniae D39-Rx1 family. Rx1 is an acapsular derivative of S. pneumoniae D39,
the prototypical pneumococcal laboratory strain isolated by Avery in 1914. Throughout the 1900s, spontaneous and
chemical mutations have been introduced into D39 by different laboratories (Figure 8). During this period
unencapsulated strains were maintained in vitro, and D39 was passaged both in vivo and in vitro. All the derivatives
of D39, including Rx1, R6, R36NC, and R36A, produced a 1.9 kb fragment upon PCR amplification of full length pspA.
All members of the family exhibited the same RFLP pattern. Digestion with Sau3A I of PCR-amplified full length pspA
revealed 0.83, 0.58, 0.36, and 0.27 kb fragments in all of the D39-Rx1 derivatives of the family. Digesting the full
length pspA with Hha I resulted in bands of 0.76, 0.47, 0.39, 0.35, and 0.12 kb (Figure 9 or Table 26).

[0234] The stability of pspA polymorphism was also investigated using pneumococcal isolates which had previously
been shown to be clonally related by other criteria, including capsule type, antibiotic resistance, enzyme
electromorph, and PspA serotype. Three sets of isolates, all of which were highly penicillin resistant, were collected
from patients during an outbreak in Hungary and two separate outbreaks in Spain. PCR-amplified full length pspA from
the capsular type 19A pneumococcal strains from the outbreak in Hungary, DB18, DB19, DB20, and DB21, resulted in a
band of approximately 2.0 kb. After digesting full length pspA with Hha I, four fragments were visualized, 0.89, 0.48,
and 0.28 kb. Digestion with Sau3A I yielded five fragments: 0.88, 0.75, 0.35, 0.34, and 0.10 kb. Capsule type 6B
pneumococcal strains, DB1, DB2, DB3, and DB4, were obtained from an outbreak in Spain. Full length pspA from these
strains was approximately 1.9 kb. Digestion of the PCR-amplified fragment with Hha I resulted in four fragments which
were 0.83, 0.43, 0.33, and 0.28 kb. Sau3A I digestion yielded 0.88, 0.75, 0.34, and 0.10 kb fragments. DB6, DB8, and
DB9, which are capsular serotype 23F strains, were isolated from a second outbreak in Spain. DB6, DB8, and DB9 had an
amplified pspA product which was 2.0 kb. Hha I digested fragments were 0.90, 0.52, 0.34, and 0.30 kb and Sau3A I
fragments were 0.75, 0.52, 0.39, 0.22, 0.20, and 0.10 kb in size (Figure 10). DB7 had a 19A capsular serotype and was
not identical to DB6, DB8, and DB9. In the D39/Rx1 family and in each of the three outbreak families the size of the
fragments obtained from the Hha I and the Sau3A I digests totaled approximately 2.0 kb, which is expected if the
amplified product represents a single pspA sequence.
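
The consistency check described above (fragment sizes adding up to roughly the size of the amplified product) can be sketched in a few lines of Python. This is only an illustration: the function name and the 15% tolerance for gel sizing error are assumptions, not values taken from the text. The fragment sizes are those reported for the D39-Rx1 family.

```python
import math


def digest_sums_to_amplicon(fragments_kb, amplicon_kb, rel_tol=0.15):
    """Return True when the fragment sizes add up to roughly the amplicon size.

    rel_tol is an assumed allowance for gel sizing error (hypothetical value).
    """
    return math.isclose(sum(fragments_kb), amplicon_kb, rel_tol=rel_tol)


# Fragment sizes (kb) reported for the D39-Rx1 family; amplicon about 1.9 kb.
d39_hha = [0.76, 0.47, 0.39, 0.35, 0.12]
d39_sau3a = [0.83, 0.58, 0.36, 0.27]

print(digest_sums_to_amplicon(d39_hha, 1.9))
print(digest_sums_to_amplicon(d39_sau3a, 1.9))

# Hypothetical counter-example: fragments from two different co-amplified
# sequences would add up to far more than one amplicon.
print(digest_sums_to_amplicon(d39_hha + d39_sau3a, 1.9))
```

A sum well above the amplicon size would suggest that more than one sequence was amplified, which is the situation the paragraph rules out.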

Diversity of RFLP pattern of amplified pspA from random pneumococcal Isolates 

[0235] PCR amplification of the pspA gene from 70 random clinical pneumococcal isolates yielded full-length pspA
ranging in size from 1.8 kb to 2.3 kb. RFLP analysis of PCR-derived pspA revealed two to six DNA fragments ranging
in size from 100 bp to 1.9 kb depending on the strain. The calculated sum of the fragments never exceeded the size
of the original amplified fragment. Not all pneumococcal strains had a unique pspA, and some seemingly unrelated
isolates from different geographical regions and different capsular types exhibited similar RFLP patterns. Isolates were
grouped into families based on the number of fragments produced by Hha I and Sau3A I digests and the relative size
of these fragments.

[0236] Based on the RFLP patterns it was possible to identify 17 families, with four of the families containing pairs of
subfamilies. Within families all of the restriction fragments were essentially the same regardless of which restriction
enzyme was used. The subfamilies represent situations where two families share most but not all of the restriction
fragments. With certain strains an RFLP pattern was observed where detectable fragment size differed from the pattern
of the established family by less than 100 bp. Since these differences were considered small compared to the
differences in fragment size and number of fragments between families, they were not considered in family designation.
The RFLP pattern of two isolates from six of the families is pictured in Figure 11 and Table 27. These families were
completely independent of the capsular type or the protein type as identified by monoclonal antibodies (Table 28 and 29).
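
The family assignment described above can be illustrated with a minimal Python sketch. It assumes, as one reading of the text, that two isolates share a family when both digests give the same number of fragments and each fragment differs by less than 100 bp. The function names and the simple first-member comparison are hypothetical choices made for this sketch. The fragment sizes are taken from Table 27.

```python
def same_pattern(a_kb, b_kb, tol_kb=0.1):
    """Same fragment count, and every size-ordered fragment within tol_kb."""
    if len(a_kb) != len(b_kb):
        return False
    pairs = zip(sorted(a_kb, reverse=True), sorted(b_kb, reverse=True))
    return all(abs(x - y) < tol_kb for x, y in pairs)


def group_into_families(isolates, tol_kb=0.1):
    """Group isolates whose Hha I and Sau3A I patterns both match.

    isolates maps a strain name to (hha_fragments_kb, sau3a_fragments_kb).
    Each isolate is compared with the first member of every existing family.
    """
    families = []
    for name, (hha, sau3a) in isolates.items():
        for family in families:
            ref_hha, ref_sau3a = isolates[family[0]]
            if same_pattern(hha, ref_hha, tol_kb) and same_pattern(sau3a, ref_sau3a, tol_kb):
                family.append(name)
                break
        else:
            families.append([name])
    return families


# Four isolates from Table 27 (sizes in kb).
isolates = {
    "BG9163": ([1.55, 0.35], [1.05, 0.35, 0.22]),
    "EF6796": ([1.5, 0.35], [1.05, 0.35, 0.22]),
    "EF5668": ([1.25, 0.49, 0.32], [1.0, 0.80, 0.35]),
    "EF3296": ([1.0, 0.40, 0.33], [1.15, 0.50, 0.34]),
}
print(group_into_families(isolates))
```

BG9163 and EF6796 differ by only 0.05 kb in one Hha I fragment, so under this assumed rule they fall into the same family, while EF5668 and EF3296 each start a separate one.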

[0237] Previous DNA hybridization studies have demonstrated that the pspA genes of different isolates are most
conserved in the 3' region of the gene and more variable in the 5' region of the gene. Thus, it seemed likely that the
differences in the pspA families reflected primarily differences in the 5' end of the gene. To confirm this theory, the
α-helical and proline-rich region of pspA was examined without the amino acid repeats. Nucleotide primers LSM13 and
SKH2 were used to amplify this fragment, which is approximately 1.6 kb. Examination of this region of pspA afforded
two things.

[0238] First, this primer pair permitted amplification of 90% of the strains, which is greater than the 75% of the
strains which can be amplified with oligonucleotides which amplify the full length gene. Second, it allowed Applicants
to examine whether the original groupings, which were based on the full length gene, coincide with the fingerprint
patterns obtained by looking at the 5' half of the gene.

[0239] Figure 12 contains the same strains which were examined in Figure 11, but the PCR products were amplified
with SKH2 and LSM13. The RFLP patterns obtained from digestion of the amplified α-helical and proline-rich region
confirm the originally designated families. However, these primers amplify a smaller portion of pspA, and therefore
the difference between the families is not as dramatic as in the RFLP patterns obtained from the full length gene.

[0240] The polymerase chain reaction has simplified the process of analyzing the pspA gene and has provided a means
of using pspA diversity to examine the epidemiology of S. pneumoniae. Because not all strains contained a unique
fingerprint of pspA, RFLP patterns of pspA cannot be used alone to identify the clonality of a strain. These results
indicate that the RFLP of PCR-amplified pspA from pneumococcal strains, in conjunction with other techniques, may be
useful for identifying the clonal relatedness among pneumococcal isolates, and that this pattern is stable over long
passages in vitro.

[0241] These findings suggest that the population of pspA is not as diverse as originally believed. PCR-RFLP of
pspA may perhaps represent a relatively simple technique to quickly assess the variability of the gene within a
population. Further, these findings enable techniques to diagnose S. pneumoniae via PCR or hybridization by primers
or probes to regions of pspA common within groupings.

[0242] The sequence studies divide the known strains into several families based on sequence homologies. Sequence
data demonstrate that there have been extensive recombinations occurring in nature within pspA genes. The
net effect of the recombination is that the "families" identified by specific sequences differ depending upon which part
of the pspA molecule is used for analysis. "Families" or "groupings" identified by the 5' half of the α-helical region,
the 3' half of the α-helical region and the proline-rich region are each distinct and differ slightly from each other. In
addition, there is considerable evidence of other diversity (including base substitutions, deletions and insertions in the
sequences) among otherwise closely related molecules.

[0243] This result indicates that it is expected that there will be a continuum of overlapping sequences of PspAs,
rather than a discrete set of sequences.

[0244] The findings indicate that there is the greatest conservation of sequence in the 3' half of the α-helical region
and in the immediate 5' tip. Because the diversity in the mid half of the α-helical region is greater, this region is of little
use in predicting cross-reactivity among vaccine components and challenge strains. Thus, the sequence of the 3' half of
the α-helical region and the 5' tip of the coding sequence are likely to be the critical sequences for predicting PspA
cross-reactions and vaccine composition.

[0245] The sequence of the proline-rich region may not be particularly important to composition of a vaccine because
this region has not been shown to be able to elicit cross-protection even though it is highly conserved. The reason for
this is presumably that the epitopes in this region are not surface exposed.

[0246] Based on our present sequences of 27 diverse pspAs we have found that there are 4 families of the 3' half
of the α-helical region and 2-3 families of the very 5' tip of the α-helical region. Together these form 6 combinations of
the 3' and 5' families. This approach therefore should permit us to identify a panel of pspAs with 3' and 5' helical
sequences representative of the greatest number of different pspAs. See Fig. 13.

Table 24.

Designation   Sequence 5'-3'                                 Nucleotide position
LSM2          GCG CGT CGA CGG CTT AAA CCC ATT CAC CAT TGG    1990 to 1967
LSM1          CCG GAT CCA GCT CCT GCA CCA AAA AC             1312 to 1331
LSM13         GCA AGC TTA TGA TAT AGA ATT TTG TAA C          1 to 26
SKH2          CCA CAT ACC GTT TTC TTG TTT CCA GCC            1333 to 1355


Table 25.

Amplification of pspA from a panel of 72 independent isolates* of S. pneumoniae.

TYPE    NUMBER OF STRAINS     LSM13 AND LSM2            LSM13 AND SKH2
        EXAMINED              % OF STRAINS AMPLIFIED    % OF STRAINS AMPLIFIED
1       3                     100                       100
2       1                     100                       100
                              50                        87
4       6                     67                        100
5       1                     100                       100
6       7                     29                        66
6A      2                     100                       100
6B      6                     100                       100
7       2                     50                        100
8       1                     100                       100
9V      3                     100                       100
9A      2                     100                       100
9L      1                     100                       100
9N      3                     100                       100
10      1                     100                       100
11      2                     50                        100
12      2                     0                         100
13      1                     100                       100
14      4                     0                         75
15      2                     50                        50
19      5                     100                       100
22      3                     33                        100
23      1                     100                       100
33      1                     0                         100
35      1                     0                         100
nd      3                     100                       100

* Our strain collection contains several groups of isolates known previously to be clonal and collected for that
purpose. The data reported in the table include only a representative isolate from such clonal groups.

Table 26.

Rx1-D39 derivatives

ISOLATE   SIZE OF Hha I DIGESTS (kb)      SIZE OF Sau3A I DIGESTS (kb)
D39       0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
Rx1       0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R800      0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R6        0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R61       0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R6X       0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R36NC     0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27
R36A      0.76, 0.47, 0.39, 0.35, 0.12    0.83, 0.58, 0.36, 0.27



Table 27.

Strain information and family designation of independent isolates.

STRAIN    CAPSULE TYPE  PspA TYPE  FAMILY  SIZE OF Hha I           SIZE OF Sau3A I
                                           FRAGMENTS               FRAGMENTS
BG9163    6B            21         C       1.55, 0.35              1.05, 0.35, 0.22
EF6796    6A            1          C       1.5, 0.35               1.05, 0.35, 0.22
EF5668    4             12         DD      1.25, 0.49, 0.32        1.0, 0.80, 0.35
EF8616A   4             ND         DD      1.25, 0.49, 0.32        1.0, 0.80, 0.35
EF3296    4             20         E       1.0, 0.40, 0.33         1.15, 0.50, 0.34
EF4135    4             ND         E       1.0, 0.40, 0.33         1.15, 0.50, 0.34
BG7619    10            ND         F       1.3, 0.40, 0.29, 0.10   0.82, 0.76, 0.35
BG7941    11            ND         F       1.3, 0.40, 0.29, 0.10   0.82, 0.76, 0.35
BG7813    14            8          H       1.05, 0.70, 0.36        0.90, 0.77, 0.35
BG7736    8             ND         H       1.05, 0.70, 0.36        0.90, 0.77, 0.35
AC113     9A            ND         I       1.4, 0.34, 0.28         1.2, 0.80
AC99      9V            5          I       1.4, 0.34, 0.28         1.2, 0.80

EXAMPLE 7 - ABILITY OF PspA IMMUNOGENS TO PROTECT AGAINST INDIVIDUAL CHALLENGE STRAINS

[0247] CBA/N or BALB/cJ mice were given 1 injection of 0.5 µg PspA in CFA, followed 2 weeks later by a boost in
saline, and challenged between 7 and 14 (average 10) days post boost. Control mice were administered a similar
immunization regimen, except that the immunogen came from an isogenic strain unable to make PspA. The PspA
was either full length PspA isolated from pneumococci, cloned full length PspA, or BC100 PspA, as little statistically
significant difference in immunogenicity has been seen between full length PspA and BC100. The challenge doses ranged
from about 10³ to 10* pneumococci in the inoculum, but in all cases the challenge was at least 100 times LD50.

[0248] The results are shown in the following Tables 30 to 60, and the conclusions set forth therein.

[0249] From the data, it appears that an antigenic, immunological or vaccine composition can contain any two to
seven, preferably three to five PspAs, e.g., PspAs from R36A and BG9739, alone, or combined with any or all of PspAs
from WU2, EF5668, and DBL5. Note that surprisingly WU2 PspA provided better protection against D39 than did
R36A/Rx1/D39, and that also surprisingly PspA from WU2 protected better against BG9739 than did PspA from BG9739.
Combinations containing R36A, BG9739 and WU2 PspAs were most widely protective; and therefore, a preferred
composition can contain any three PspAs, preferably R36A, BG9739 and WU2. The data in this Example show that
PspA from varying strains is protective, and that it is possible to formulate protective compositions using any PspA or
any combination of the PspAs from the eight different PspAs employed in the tests. Similarly, one can select PspAs
on the basis of the groupings in the previous Example. Note additionally that each of the PspAs from R36A, BG9739,
EF5668 and DBL5 is, from the data, good for use in compositions.

[0250] A note about the use of medians rather than averages. Applicants have chosen to express data as medians (a
non-parametric statistic) rather than averages because the times to death do not follow a normal distribution. In fact
there are generally two peaks. One is around day 3 or 6, when most of the mice die, and the other is at > 21 for mice
that live. Thus, it becomes nonsensical to average values like 21 or 22 with values like 3 or 6. One mouse that lives
out of 5 has a tremendous effect on such an average but very little effect on the median. Thus, the median becomes
the most robust estimator of time to death of most of the mice.
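
The point about medians can be seen with a small numerical illustration. The days-to-death values below are hypothetical and are not data from the Tables; a surviving mouse (> 21 days) is coded as 22 purely for the purpose of the calculation.

```python
import statistics

# Hypothetical groups of five mice; values are days alive after challenge.
days_all_died = [3, 3, 4, 6, 6]
days_one_survivor = [3, 3, 4, 6, 22]  # survivor (>21 days) coded as 22

for label, days in [("all died", days_all_died), ("one survivor", days_one_survivor)]:
    mean = statistics.mean(days)
    median = statistics.median(days)
    print(f"{label}: mean = {mean:.1f}, median = {median}")
```

In this sketch the single survivor moves the mean from 4.4 to 7.6 days, while the median stays at 4 days in both groups.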

TABLE 36

Best Choice for Vaccine Components as of 95/8/27

Vaccine Component (cumulative strains protected) % maximally protected

Criterion                     1                 2                 3                 4                 5                 6
≥#1 PspA for each             R36A (7) 50%      WU2 (10) 71%      BG9739* (11) 79%  EF5668 (12) 86%   DBL5 (13) 93%     DBL6A (14) 100%
  challenge strain
≥#2 PspA for each             R36A (12) 86%     BG9739 (12) 100%
  challenge strain
Max score (+) type score      R36A 64%          WU2 (11) 79%      BG9739 (13) 92%   DBL5 (14) 100%
Max increase in days alive    R36A (9) 64%      WU2 (11) 79%      BG9739 (13) 92%   DBL5 (14) 100%
% protected                   R36A (7) 50%      WU2 (10) 64%      DBL5 (11) 79%     EF5668 (12) 86%   DBL6A (13) 92%    EF3296 (14) 100%
Theoretical mixture based     R36A (10) 64%     BG9739 (12) 86%   DBL5 (13) 92%     EF3296 (14) 100%
  on a few testable
  assumptions (see below)

* This is not a unique combination. See table below.



TABLE 37

Combinations where all Challenge Strains have a Vaccine strain with a score of ≥ #2

Number of PspAs      Combination              Number of #1 strains  Total #1s  Total #1s and #2s
in Combination
2                    R36A + BG9739            8                     10         20
3                    R36A + BG9739 + WU2      11                    15         25
3                    R36A + WU2 + DBL5        11                    15         21
3                    R36A + WU2 + EF5668      11                    15         23
3                    R36A + WU2 + DBL5        11                    15         22


TABLE 38

Pooled Data for Protection against D39 by various PspAs; Days alive for each mouse

                              Days to Death/Immunogen
Exp.         Log CFU  Mice    Rx1/R36A                      JD908         EF5668                 All Immune                              Control
             D39              D39                           (WU2)
E143         4.5      CBA/N                                               1,1,2,2,2                                                      1,1,2,2,3
E145         4.0      CBA/N   2,3,3,3,4                                                                                                  1,1,2,3,4
E028         5.93     BALB/c  3, 3x>21                                                                                                   2,2,2,4
[illegible]                                                               2,6, 3x>10                                                     3,3,3,5,5
E140         2.81     CBA/N   4,4,5,7,15                                                                                                 2,2,2
E169         2.7      CBA/N   2, 4x>21                      2,5, 3x>21                                                                   1,2,2,2,3
E154         2.0      CBA/N   2,2,3, 2x>21                                                                                               4x1, 6x2,3,3,4
All ≤ 3.0                     2,3,3,3,4,4,4,5,7,15 *                      1,1,2,2,2                                                      4x1, 6x2,3,3,4
All                           4x2, 5x3, 3x4,5,7,15, 9x>21   2,5, 3x>21    1,1,2,2,2,2,6, 3x>21   1,1, 9x2, 5x3, 3x4,5,5,6,7,15, 15x>21   5x1, 16x2, 6x3,4,4,5,5,5,>21

EXAMPLE 8 - ABILITY OF PspA IMMUNOGENS TO PROTECT AGAINST INDIVIDUAL CHALLENGE STRAINS

[0251] In Example 7 some of the capsular type 2, 4, and 5 strains were not completely protected from death by
immunization. In these studies the BALB/cByJ mouse was used instead of the hypersusceptible, immunodeficient
CBA/N mouse used for the Example 7 studies. With the BALB/cJ mouse it was observed that immunization with PspA was
in fact able to protect against death with capsular type 2, 4, and 5 pneumococci. This result is shown in the table below.

[0252] The data from Table 60A also demonstrate that a mixture of 4 - 5 full length PspAs was as effective as, or more
effective than, immunization with a single PspA.

EXAMPLE 9 - CHARACTERIZATION OF PspA EPITOPES WITHIN PNEUMOCOCCAL STRAINS MC25-28

[0253] The strains examined came from a group of 13 capsular serotype 6B strains that have been identified as
members of a multiresistant clone, having resistance to penicillin, chloramphenicol, and tetracycline; some have also
acquired resistance to erythromycin. The pneumococcal isolates described in the following studies (MC25-28) are
members of this 6B clone. Although previously thought to be geographically restricted to Spain (unlike the widespread
multiresistant Spanish serotype 23F clone), members of this clone have been shown to be responsible for an increase
in resistance to penicillin in Iceland (Soares, S., et al., J. Infect. Dis. 1993; 168: 158-163).

[0254] The following techniques were used to characterize the location of different PspA epitopes:

Bacterial cell culture 

[0255] Bacteria were grown in Todd-Hewitt broth with 0.5% yeast extract or on blood agar plates overnight at 37°C
in a candle jar. Capsular serotype was confirmed by cell agglutination using Danish antisera (Statens Seruminstitut,
Copenhagen, Denmark). The isolates were subtyped as 6B by Quellung reaction, utilizing rabbit antisera against 6A
or 6B capsule antigen.

Bacterial lysates

[0256] Cell lysates were prepared by incubating the bacterial cell pellet with 0.1% sodium deoxycholate, 0.01%
sodium dodecylsulfate (SDS), and 0.15 M sodium citrate, and then diluting the lysate in 0.5M Tris hydrochloride (pH
6.8). Total pneumococcal protein in the lysates was quantitated by the bicinchoninic acid method (BCA Protein Assay
Reagent; Pierce Chemical Company, Rockford, IL).

PspA serotyping 

[0257] Pneumococcal cell lysates were subjected to SDS-PAGE, transferred to nitrocellulose membranes, and
developed as Western blots using a panel of seven MAbs to PspA. PspA serotypes were assigned based on the particular
combination of MAbs with which each PspA was reactive.

Colony immunoblotting 

[0258] A ten mL tube of Todd-Hewitt broth with 0.5% yeast extract was inoculated with overnight growth of MC25
from a blood agar plate. The isolate was allowed to grow to a concentration of 10⁷ cells/mL as determined by an O.D.
of 0.07 at 590 nm. MC25 was serially diluted and spread-plated on blood agar plates to give approximately 100 cells
per plate. The plates were allowed to grow overnight in a candle jar, and a single blood agar plate with well-defined
colonies was selected. Four nitrocellulose membranes were consecutively placed on the plate. Each membrane was
lightly weighted and left in place for 5 min. In order to investigate the possibility of phase variation between the two
proteins detected on Western blots, a single colony was picked from the plate, resuspended in Ringer's solution, and
spread-plated onto a blood agar plate. The membranes were developed as Western blots according to PspA serotyping
methods.
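
As a rough illustration of the dilution arithmetic behind plating about 100 cells from a culture of about 10⁷ cells/mL, the sketch below counts the ten-fold dilution steps required. The plated volume of 0.1 mL is an assumption made for this example and is not stated in the text; the function name is likewise hypothetical.

```python
import math


def tenfold_dilution_steps(start_cells_per_ml, target_cells_per_plate, plated_volume_ml=0.1):
    """Number of ten-fold dilutions needed before plating.

    plated_volume_ml is an assumed spread-plate volume (not given in the text).
    """
    target_cells_per_ml = target_cells_per_plate / plated_volume_ml
    return round(math.log10(start_cells_per_ml / target_cells_per_ml))


# About 10^7 cells/mL diluted to give about 100 cells per plate.
print(tenfold_dilution_steps(1e7, 100))
```

With a different plated volume the number of steps changes accordingly, so the result is only as good as that assumption.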

[0259] When the strains MC25-28 were examined with the panel of seven MAbs specific for different PspA epitopes,
all four demonstrated the same patterns of reactivity (Fig. 14). The MAbs XiR278 and 2A4 detected a PspA molecule
with an apparent molecular weight of 190 kDa in each isolate. In accordance with the PspA serotyping system, the
190 kDa molecule was designated as PspA type 6 because of its reactivity with XiR278 and 2A4, but none of the five
other MAbs in the typing system. Each isolate also produced a second PspA molecule with an apparent molecular
weight of 82 kDa. The 82 kDa PspA of each isolate was detected only with the MAb 7D2 and was designated as type
34. No reactivity was detected with MAbs Xi126, Xi64, 1A4, or SR4W4. Results from the colony immunoblotting showed
that both PspAs were present simultaneously in these isolates under in vitro growth conditions. All colonies on the
plate, as well as all of the progeny from a single colony, reacted with MAbs XiR278, 2A4, and 7D2.
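
The typing rule used above, in which a PspA type is assigned from the exact combination of reactive MAbs, can be sketched as a lookup on reactivity sets. Only the two patterns described in this Example are included; the complete typing table is not reproduced here, and the function and variable names are hypothetical.

```python
# The seven MAbs of the typing panel, as named in the text.
PANEL = frozenset({"Xi126", "Xi64", "1A4", "SR4W4", "XiR278", "2A4", "7D2"})

# Only the two reactivity patterns described in this Example.
KNOWN_PATTERNS = {
    frozenset({"XiR278", "2A4"}): 6,
    frozenset({"7D2"}): 34,
}


def pspa_serotype(reactive_mabs):
    """Return the PspA type for an exact reactivity pattern, or None if not listed."""
    pattern = frozenset(reactive_mabs)
    unknown = pattern - PANEL
    if unknown:
        raise ValueError(f"MAbs not in the typing panel: {sorted(unknown)}")
    return KNOWN_PATTERNS.get(pattern)


print(pspa_serotype(["XiR278", "2A4"]))  # the 190 kDa molecule
print(pspa_serotype(["7D2"]))            # the 82 kDa molecule
```

Matching on the full set, rather than on any single antibody, reflects that a type is defined both by the MAbs that react and by those that do not.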

EXAMPLE 10 - SOUTHERN BLOT ANALYSIS OF CHROMOSOMAL DNA ISOLATED FROM PNEUMOCOCCAL
STRAINS MC25-28

[0260] Pneumococcal chromosomal DNA was prepared by the Youderian method (Sheffield, J.S., et al.,
Biotechniques, 1992; 12: 836-839). Briefly, for a 500 ml culture in THY or THY with 1% choline, cells were centrifuged
at 8000 rpm in a GSA rotor for 30 minutes at 4°C. The supernatant was decanted, and the cells were washed with 1 to 2
volumes of sterile water to remove choline, if used. This step was only necessary when sodium deoxycholate was used.
The washed cells were centrifuged twice at 8000 rpm in a GSA rotor for 10 minutes. Cells were resuspended in 3.5 ml TE
buffer containing 1% SDS or 1% sodium deoxycholate, and incubated at 37°C for 15 minutes if sodium deoxycholate was
used. If SDS was used, incubation at 37°C was not necessary. The cells were incubated at 65°C for 15 minutes,
1/5 volume of 5.0 M potassium acetate was added, and the cell suspension was incubated for 30 minutes at 65°C.

[0261] The cells were placed on ice for 60 minutes, and centrifuged at 12,000 rpm in an SS-34 rotor for 10 minutes.
The supernatant was transferred to a clean centrifuge tube, and 2 volumes of cold 95% ethanol were added. After
mixing, DNA was spooled onto a glass Pasteur pipet and air dried. The DNA was resuspended in 4 ml TE, and 4.0 g
cesium chloride was added. The solution was split into two aliquots in ultracentrifuge tubes, and the tubes were filled
to their maximum capacity using 1.0 g/ml cesium chloride in TE. Before closing the tubes, 300 µl of 10 µg/ml ethidium
bromide was added.

[0262] The solution was centrifuged at 45,000 rpm overnight, or for 6 hours at 55,000 rpm. The chromosomal band
was recovered from the gradient and extracted at least 6 times, each time with 1 volume of salt-saturated isopropanol.
The DNA was precipitated from the aqueous phase by adding 2 volumes of 95% ethanol. The DNA came out of solution
immediately, and it was spooled onto a Pasteur pipet. The DNA pellet was washed by dipping the spooled DNA in 5 ml
70% ethanol. The DNA was air dried, and resuspended in the desired volume of TE, e.g., 500 µl.

[0263] The cells were harvested, washed, lysed, and digested with 0.5% (wt/vol) SDS and 100 µg/mL proteinase K
at 37°C for 1 h. The cell wall debris, proteins, and polysaccharides were complexed with 1% hexadecyl trimethyl
ammonium bromide (CTAB) and 0.7M sodium chloride at 65°C for 20 min, and then extracted with chloroform/isoamyl
alcohol. DNA was precipitated with 0.6 volumes isopropanol, washed, and resuspended in 10mM Tris-HCl, 1mM EDTA,
pH 8.0. DNA concentration was determined by spectrophotometric analysis at 260 nm (Meade, H.M. et al., J. Bacteriol.
1982; 149: 114-122; Silhavy, T.J. et al., Experiments with Gene Fusion, Cold Spring Harbor: Cold Spring Harbor
Laboratory, 1984; and Murray, M.G., et al., Nucleic Acids Res. 1980; 8: 4321-4325).


Probe preparation 

[0264] 5' and 3' oligonucleotide primers homologous with nucleotides 1 to 26 and 1967 to 1990 of Rx1 pspA (LSM13
and LSM2, respectively) were used to amplify the full length pspA and construct probe LSMpspA13/2 from Rx1 genomic
DNA. 5' and 3' oligonucleotide primers homologous to nucleotides 161 to 187 and nucleotides 1093 to 1117 (LSM12
and LSM6, respectively) were used to amplify the variable α-helical region to construct probe LSMpspA12/6. PCR
generated DNA was purified by Gene Clean (Bio101 Inc., Vista, CA) and random prime-labeled with
digoxigenin-11-dUTP using the Genius 1 Nonradioactive DNA Labeling and Detection Kit as described by the manufacturer
(Boehringer Mannheim, Indianapolis, IN).


DNA electrophoresis 

[0265] For Southern blot analysis, approximately 10 µg of chromosomal DNA was digested to completion with a single
restriction endonuclease (Hind III, Kpn I, EcoR I, Dra I, or Pst I), then electrophoresed on a 0.7% agarose gel for 16-48
h at 35 volts. For PCR analysis, 5 µl of product were incubated with a single restriction endonuclease (Bcl I, BamH I,
Bst I, Pst I, Sac I, EcoR I, Sma I, and Kpn I), then electrophoresed on a 1.3% agarose gel for 2-3 h at 90 volts. In both
cases, 1 kb DNA ladder was used for molecular weight markers (BRL, Gaithersburg, MD), and gels were stained with
ethidium bromide for 10 min and photographed with a ruler.

40 Southern blot hybridization 

[0266] The DNA in the gel was depurinated in 0.25N HCl for 10 min, denatured in 0.5M NaOH and 1.5M NaCl for
30 min, and neutralized in 0.5M Tris-HCl (pH 7.2), 1.5M NaCl and 1mM disodium EDTA for 30 min. DNA was transferred
to a nylon membrane (Micron Separations INC, MA) using a POSIBLOT pressure blotter (Stratagene, La Jolla, CA) for
45 min and fixed by UV irradiation. The membranes were prehybridized for 3 h at 42°C in 50% formamide, 5X SSC,
5X Denhardt solution, 25mM sodium phosphate (pH 6.5), 0.5% SDS, 3% (wt/vol) dextran sulfate and 500 µg/mL of
denatured salmon sperm DNA. The membranes were then hybridized at 42°C for 18 h in a solution containing 45%
formamide, 5X SSC, 1X Denhardt solution, 20mM sodium phosphate (pH 6.5), 0.5% SDS, 3% dextran sulfate, 250 µg/mL
denatured sheared salmon sperm DNA, and about 20 ng of heat-denatured digoxigenin-labeled probe DNA. After
hybridization, the membranes were washed twice in 0.1% SDS and 2X SSC for 3 min at room temperature. The
membranes were then washed twice to a final stringency of 0.1% SDS in 0.3X SSC at 65°C for 15 min. This procedure
yielded a stringency greater than 95 percent. The membranes were developed using the Genius 1 Nonradioactive DNA
Labeling and Detection Kit as described by the manufacturer (Boehringer Mannheim, Indianapolis, IN). To perform
additional hybridization with other probes, the membranes were stripped in 0.2N NaOH/0.1% SDS at 40°C for 30 min and
then washed twice in 2X SSC.

PCR

[0267] 5' and 3' primers homologous with the DNA encoding the N- and C-terminal ends of PspA (LSM13 and LSM2,
respectively) were used. Reactions were conducted in 50 µL volumes containing 0.2mM of each dNTP, and 1 µL of each
primer at a working concentration of 50mM. MgCl2 was used at an optimal concentration of 1.75mM with 0.25 units of
Taq DNA polymerase. Ten to thirty ng of genomic DNA was added to each reaction tube. The amplification reactions
were performed in a thermal cycler (M.J. Research, Inc.) using the following three step program: Step 1 consisted of
a denaturing temperature of 94°C for 2 min; Step 2 consisted of 9 complete cycles of a denaturing temperature of 94°C
for 1 min, an annealing temperature of 50°C for 2 min, and an extension temperature of 72°C for 3 min; Step 3 cycled
19 times with a denaturing temperature of 94°C for 1 min, an annealing temperature of 60°C for 2 min, and an extension
temperature of 72°C for 3 min; and at the end of the last cycle, the samples were held at 72°C for 5 min to ensure
complete extension.
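
The three step program above can be written down as data, which makes the cycle count and the programmed hold time easy to check. This is a minimal sketch: the `Stage` class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    cycles: int
    steps: list  # list of (temperature_c, minutes) pairs


# Program as described in the text, plus the final 72°C hold.
PROGRAM = [
    Stage(cycles=1, steps=[(94, 2)]),                       # Step 1
    Stage(cycles=9, steps=[(94, 1), (50, 2), (72, 3)]),     # Step 2
    Stage(cycles=19, steps=[(94, 1), (60, 2), (72, 3)]),    # Step 3
    Stage(cycles=1, steps=[(72, 5)]),                       # final extension
]


def total_hold_minutes(program):
    """Sum of programmed hold times, excluding temperature ramps."""
    return sum(stage.cycles * sum(minutes for _, minutes in stage.steps)
               for stage in program)


def amplification_cycles(program):
    """Count cycles in stages that contain denature, anneal and extend steps."""
    return sum(stage.cycles for stage in program if len(stage.steps) == 3)


print(amplification_cycles(PROGRAM), "cycles,", total_hold_minutes(PROGRAM), "min of programmed holds")
```

Under these assumptions the program comprises 28 amplification cycles and 175 minutes of programmed holds; actual run time would be longer once ramping is included.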

Band size estimation 


[0268] Fragment sizes in the molecular weight standard and in the Southern blot hybridization patterns were
calculated from migration distances. The standard molecular sizes were fitted to a logarithmic regression model using
Cricket Graph (Cricket Software, Malvern, PA). The molecular weights of the detected bands were estimated by entering
the logarithmic line equation obtained by Cricket Graph into Microsoft Excel (Microsoft Corporation, Redmond, WA) in
order to calculate molecular weights based on migration distances observed in the Southern blot.
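
The band size estimation can be approximated in Python as a stand-in for the Cricket Graph and Excel steps. The sketch assumes a semi-log calibration, that is, a straight-line least-squares fit of log10(size) against migration distance; the exact model fitted by Cricket Graph is not specified in the text. The ladder distances and sizes below are hypothetical values chosen only to show the calculation.

```python
import math


def fit_log_linear(distances_mm, sizes_kb):
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    n = len(distances_mm)
    logs = [math.log10(size) for size in sizes_kb]
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_kb(distance_mm, slope, intercept):
    """Convert a migration distance into a fragment size using the fitted line."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical ladder: migration distance (mm) and known size (kb).
ladder_mm = [10, 20, 30, 40]
ladder_kb = [8.0, 4.0, 2.0, 1.0]

slope, intercept = fit_log_linear(ladder_mm, ladder_kb)
print(round(estimate_size_kb(25, slope, intercept), 2), "kb for a band at 25 mm")
```

Real gels usually deviate from a straight semi-log line at the largest and smallest fragment sizes, so a calibration of this kind is best applied within the range spanned by the standards.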

[0269] Since most strains contain a pspA gene and a pspC gene, it was expected that if an extra gene were present
one might observe at least three pspA-homologous loci in isolates MC25-28. In Hind III digests of MC25-28 each strain
revealed 7.7 and 3.6 kb bands when probed with LSMpspA13/2 (Figure 15A and 18C). In comparison, when Rx1 DNA
was digested with Hind III and hybridized with LSMpspA13/2, homologous sequences were detected on 9.1 and 4.2
kb fragments, as expected from previous studies with PspA (Figure 15A). Results consistent with two pspA-homologous
genes in MC25-28 were also obtained with digests using four additional enzymes (Table 61).



Table 61.

Chromosomal RFLPs with probe LSMpspA13/2 for isolates MC25-28 and Rx1

                     Strains Examined                    Restriction Fragments (sizes in kilobases)
Restriction Enzyme   MC25   MC26   MC27   MC28   Rx1     MC25-28        Rx1
Hind III             +      +                    +       7.7, 3.6       9.1, 4.2
Kpn I                +      +      +      +      +       11.6, 10.6     10.6, 9.8
EcoR I               +                           +       8.4, 7.6       7.8, 6.6
Dra I                +                           +       2.1, 1.1       1.9, 0.9
Pst I                +                                   >14, 6.1       10.0, 4.0


[0270] The four isolates examined are all members of a single clone of capsular type 6B pneumococci isolated from Spain. These four isolates are the first in which two PspAs have been observed, i.e., PspA and PspC, based on the observation that bands of different molecular weights were detected by different MAbs to PspA. Mutation and immunochemistry studies have demonstrated that all of the different sized PspA bands from Rx1 are made from a single gene capable of encoding a 69 kDa protein, supporting the assertion that two PspAs have been observed, i.e., PspA and PspC.

[0271] It has been observed that probes for the 5' half of pspA (encoding the α-helical half of the protein) bind the pspC sequence of most strains only at a stringency of around 90%. With chromosomal digests of MC25-28, it was observed that the 5' Rx1 probe LSMpspA12/6 (Figure 15D) bound two pspA-homologous bands at even higher stringency. The same probe bound only the pspA-containing fragment of Rx1 at the higher stringency (Figure 15B).

[0272] Further characterization of the pspA gene was done by RFLP analysis of PCR-amplified pspA from each strain. Since previous studies indicated that individual strains yielded only one product, and since the amplification was conducted with primers based on a known pspA sequence, it was assumed that the product amplified from each strain represented the pspA rather than the pspC gene. When MC25-28 were subjected to this procedure, an amplified pspA product of 2.1 kb was obtained from each of the four strains. When digested with HhaI, this fragment yielded bands of 1.1, 0.46, 0.21 and 0.19 kb for each of the four isolates. A single isolate, MC25, was analyzed with eight additional enzymes. Using each restriction enzyme, the sum of the fragments was always approximately equal to the size of whole pspA (Figure 16). These results suggested that the 2.1 kb amplified DNA represents the amplified product of only a single pspA gene. Rx1 produced an amplified product of 2.0 kb and five fragments of 0.76, 0.468, 0.390, 0.349 and 0.120 kb when digested with HhaI, as expected from its known pspA sequence.
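The reasoning above (restriction fragments summing to roughly the amplicon size implies a single amplified gene) can be checked with a short sketch. The fragment sizes are those quoted in the text; the 10% tolerance is an assumption chosen for illustration, since the text says only "approximately equal".

```python
def fragments_consistent(fragments_kb, amplicon_kb, tolerance=0.10):
    """Return (sum of fragments, whether the sum is within `tolerance`
    of the amplicon size). The 10% default is an illustrative assumption."""
    total = sum(fragments_kb)
    return total, abs(total - amplicon_kb) <= tolerance * amplicon_kb


# HhaI digests reported in the text
mc25_28 = fragments_consistent([1.1, 0.46, 0.21, 0.19], amplicon_kb=2.1)
rx1 = fragments_consistent([0.76, 0.468, 0.390, 0.349, 0.120], amplicon_kb=2.0)
```

A sum well above the amplicon size would instead suggest a mixture of two different amplified products, which is the alternative the paragraph rules out.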

[0273] There are several possible explanations for the observation of PspA and PspC in these strains but not in other strains. All isolates might make PspA and PspC in culture, but MAbs generally recognize only PspA (perhaps, in these isolates there has been a recombination between pspA DNA and the pspC locus, allowing that locus to make a product detected by MAb to PspA). All isolates can have PspA and PspC, but the expression of one of them generally does not occur under in vitro growth conditions. The pspC locus is normally a nonfunctional pseudogene sequence that, for an unexplained reason, has become functional in these isolates. Results from the colony immunoblotting of these isolates failed to show a detectable in vitro phase shift between either PspA type 6 (XIR278 and 2A4) or PspA type 34 (7D2) protein. This strengthens the second explanation, and suggests that the second PspA in these isolates is due to the pspC gene not being turned off during in vitro growth conditions.

[0274] Presumably, in these four strains, the second PspA protein is provided by the pspC DNA sequence. At high stringency, the probe comprising the coding region of the α-helical half of PspA recognized both pspA-homologous sequences of MC25-28, but not the pspC sequence of Rx1. The finding indicated that the pspC sequence of MC25-28 is more similar to the Rx1 pspA sequence than is the Rx1 pspC sequence. If the pspC sequence of these strains is more similar to pspA than most pspC sequences, it could explain why the products of pspC genes cannot generally be identified by MAbs.

EXAMPLE 11 - IDENTIFICATION OF CONSERVED AND VARIABLE REGIONS OF pspA AND pspC SEQUENCES OF S. PNEUMONIAE

[0275] The S. pneumoniae strains used in this study are listed in Table 62. The strains are human clinical isolates representing 12 capsular and PspA serotypes. All strains were grown at 37°C in 100 ml of Todd-Hewitt broth supplemented with 0.5% yeast extract to an approximate density of 5 x 10^8 cells/ml. After harvesting of the cells by centrifugation (2900 g, 10 min), the DNA was isolated and stored at 4°C in TE (10 mM Tris, 1 mM EDTA, pH 8.0).



Table 62. Streptococcus pneumoniae strains used.

Strain     Relevant phenotype                             Reference
WU2        Capsular type 3, PspA type 1                   Briles et al., 1981
D39        Capsular type 2, PspA type 25                  Avery et al., 1944
R36A       Nonencapsulated mutant of D39, PspA type 25    Avery et al., 1944
Rx1        Derivative of R36A, PspA type 25               Shoemaker and Guild, 1974
DBL5       Capsular type 5, PspA type 33                  Yother et al., 1986
DBL6A      Capsular type 6A, PspA type 19                 Yother et al., 1986
A66        Capsular type 3, PspA type 13                  Avery et al., 1944
AC94       Capsular type 9L, PspA type 0                  Waltman et al., 1992
AC17       Capsular type 9L, PspA type 0                  Waltman et al., 1992
AC40       Capsular type 9L, PspA type 0                  Waltman et al., 1992
AC107      Capsular type 9V, PspA type 0                  Waltman et al., 1992
AC100      Capsular type 9V, PspA type 0                  Waltman et al., 1992
AC140      Capsular type 9N, PspA type 18                 Waltman et al., 1992
D109-1B    Capsular type 23, PspA type 12                 McDaniel et al., 1992
BG9709     Capsular type 9, PspA type 0                   McDaniel et al., 1992
L81905     Capsular type 4, PspA type 25                  McDaniel et al., 1992
L82233     Capsular type 14, PspA type 0                  McDaniel et al., 1992
L82006     Capsular type 1, PspA type 0                   McDaniel et al., 1992



[0276] Approximately 5 ng of chromosomal DNA was digested with HindIII according to the manufacturer's instructions (Promega, Inc., Madison, WI). The digested DNA was subjected to electrophoresis at 35 mV overnight in 0.8% agarose gels and then vacuum-blotted onto Nytran® membranes (Schleicher & Schuell, Keene, NH).

[0277] The oligonucleotides used were based on the previously determined sequence of Rx1 pspA. Their position and orientation relative to the structural domains of Rx1 pspA are shown in Figure 17. Labeling of oligonucleotides and detection of probe-target hybrids were both performed with the Genius System® according to the manufacturer's instructions (Boehringer Mannheim, Indianapolis, IN). All hybridizations were done for 18 hours at 42°C without formamide. By assuming that 1% base-pair mismatching results in a 1°C decrease in Tm, arbitrary designations of "high" and "low" stringency were defined by salt concentration and temperature of post-hybridization washes. Homology between probe and target sequences was derived using calculated Tm by established methods. High stringency is defined as ≥ 90%, and low stringency as ≥ 85% base-pair matching.
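The stringency rule stated above can be written out as a small sketch. It assumes that the Tm of the perfectly matched hybrid under the wash salt conditions has already been calculated by whatever established method was used (the text does not say which); the function names are illustrative only.

```python
def min_percent_match(tm_perfect_c, wash_temp_c, degrees_per_percent=1.0):
    """Minimum base-pair matching a hybrid needs to survive a wash.

    Uses the rule from the text: each 1% mismatch lowers Tm by about 1 degree C.
    `tm_perfect_c` is the calculated Tm of the perfectly matched hybrid
    under the wash salt conditions (an input here, not computed).
    """
    tolerated_mismatch = max(0.0, (tm_perfect_c - wash_temp_c) / degrees_per_percent)
    return max(0.0, 100.0 - tolerated_mismatch)


def stringency_label(percent_match):
    """Classify a wash using the thresholds defined in the text."""
    if percent_match >= 90:
        return "high"
    if percent_match >= 85:
        return "low"
    return "below low"
```

For example, a wash 8°C below the calculated Tm tolerates about 8% mismatch, which corresponds to 92% matching and falls in the "high" stringency class.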

[0278] PCR primers, which were also used as oligonucleotide probes in Southern blotting and hybridizations, were designed based on the sequence of pspA from pneumococcal strain Rx1. These oligonucleotides were synthesized by Oligos, Etc. (Wilson, OR), and are listed in Table 63.






Table 63. Oligonucleotide sequences*

Primer    5' -> 3'
LSM111    ca&ATcaurcrccra^^
LSM2      GCCCGTCGACXJCTTAAACCCATTCACGATTCG
LSM3      COX2AXCCTGAGCGAGAGCAGTTGGCTO
LSM4      CCGGATCCGCTCAAAGAGATTGATGAGTCTG
LSM5      GCGGATCCCGTAGCCAGTGAGTCTAAAGCTO
LSM6      CTGAGTCGACTGGAGTTTCTGGACCTGGAGC
LSM7      CCGGATCCAGCTCGAGCTCCAGAAACTCCAG
LSM9      GTTTTTGGTGCAGGAGCTOG
LSM10     GCTATGGCTACAGGTTG
LSM12     CCGGATCCAGCGTGCCTATCTTAGGGGCTGGT
LSM112    GCGGATCCTTGACCAATARRRACGGAGGAGGC




[0279] PCR was done with an MJ Research, Inc., Programmable Thermal Cycler (Watertown, MA), using approximately 10 ng of genomic pneumococcal DNA as template with designated 5' and 3' primer pairs. The sample was brought to a total volume of 50 µl containing a final concentration of 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 0.01% gelatin, 0.5 µM of each primer, 200 µM of each deoxynucleoside triphosphate, and 2.5 U of Taq DNA polymerase. The samples were denatured at 94°C for 2 minutes and subjected to 10 cycles consisting of 1 min at 94°C, 2 min at 50°C, and 3 min at 72°C, followed by 20 cycles of 1 min at 94°C, 2 min at 60°C, and 3 min at 72°C. After 30 total cycles, the samples were held at 72°C for an additional 5 min prior to cooling to 4°C. The amplicons were then analyzed by agarose gel electrophoresis.
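The two-stage cycling program above can be laid out as a small data structure, which makes the cycle count and programmed hold time easy to verify. This is only a sketch: the class and stage names are illustrative, and the time total ignores temperature ramping and the final 4°C hold, neither of which is specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    minutes: int


@dataclass(frozen=True)
class Stage:
    name: str
    repeats: int
    steps: tuple


# Temperatures and times as given in the text; stage names are illustrative.
PROGRAM = (
    Stage("initial denaturation", 1, (Step(94, 2),)),
    Stage("cycles with 50 C annealing", 10, (Step(94, 1), Step(50, 2), Step(72, 3))),
    Stage("cycles with 60 C annealing", 20, (Step(94, 1), Step(60, 2), Step(72, 3))),
    Stage("final extension", 1, (Step(72, 5),)),
)


def amplification_cycles(program):
    """Count repeats of stages that hold a full denature/anneal/extend cycle."""
    return sum(stage.repeats for stage in program if len(stage.steps) == 3)


def programmed_minutes(program):
    """Total programmed hold time, excluding ramps and the 4 C hold."""
    return sum(
        stage.repeats * sum(step.minutes for step in stage.steps)
        for stage in program
    )
```

The first ten cycles anneal at the lower temperature (50°C) and the remaining twenty at 60°C, which together give the 30 total cycles mentioned in the text.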

[0280] Oligonucleotides were used to probe HindIII digests of DNA from 18 strains of S. pneumoniae under conditions of low and high stringency. Each strain was also screened using a full-length pspA probe. Table 64 summarizes the results for each strain under conditions of high stringency. Strain Rx1 is a laboratory derivative of the clinical isolate D39 and consequently, both strains showed identical hybridization patterns and are shown as a single column in Table 64.

[0281] The only strain which did not have more than one pspA-homologous HindIII fragment was WU2, as was previously shown using a full-length pspA probe. Even at high stringency, six of the eight probes detected more than one fragment in at least one of the 18 strains (Table 64). LSM7, 10 and 12 hybridized with two fragments in more than one-half of the strains, and the fragments detected by the oligonucleotide probes were identical in size to those detected by the full-length pspA probe. Moreover, the same pairs of fragments were frequently detected by probes derived from the 3' as well as the 5' region of Rx1 pspA. These results suggested that the HindIII fragments from different isolates include two separate but homologous sequences, rather than fragments of a single pspA gene. Based on the diversity of the hybridization patterns and the size of restriction fragments, it is clear that pspA and pspC sequences are highly diverse and that these loci have considerable sequence variability as determined by location of HindIII recognition sites.
[0282] Oligonucleotides which hybridize with a single restriction fragment in each strain were assumed to be specific for pspA. At high stringency, LSM3 and LSM4 detected only a single HindIII fragment in the strains with which they reacted. Restriction fragments containing homology to LSM3 or LSM4 were the same as those which hybridize with all of the other homologous probes. This suggested that LSM3 and LSM4 specifically detect pspA rather than the pspC sequence. That LSM3 hybridizes with a single restriction fragment of WU2 further confirmed that this oligonucleotide is specific for pspA. Sequences from the portion of the gene encoding the second proline region (LSM111) and the C-terminus (LSM2) appeared to be relatively specific for pspA, since they generally detect only one of the HindIII fragments of each strain.

[0283] Oligonucleotides LSM12 and LSM10 were able to detect the most conserved epitopes of pspA and generally hybridize with multiple restriction fragments of each strain (Table 65). LSM7 was not as broadly cross-reactive, but detected two pspAs in 41% of strains, including almost 60% of the strains with which it reacts. Thus, sequences representing the leader, first proline region, and the repeat region appear to be relatively conserved not only within pspA but between the pspA and pspC sequences. LSM3, 4, and 5 hybridize with the smallest number of strains of any oligonucleotides (29-35 percent), suggesting that the α-helical domain is the least conserved region within pspA. In strains BG58C and L81905, oligonucleotides detect more than two HindIII fragments containing sequences with homology to pspA. Because of the absence of HindIII restriction sites within any of the oligonucleotides, it was unlikely that these multiple fragments result from the digestion of chromosomal DNA within the target regions. Also, the additional restriction fragments were detected at high stringency by more than one oligonucleotide. Possibly, in these two strains, there are three or four sequences with DNA homology to some portions of pspA. The probes most consistently reactive with these additional sequences are those for the leader, the α-helical region, and the proline-rich region.
[0284] The oligonucleotides used as hybridization probes were also tested for their utility as primers in the polymerase chain reaction (PCR). Amplification of pspA from 14 strains of S. pneumoniae comprising 12 different capsular types was attempted with the primers listed in Table 63. LSM2, derived from the 3' end of pspA, was able to amplify an apparent pspA sequence from each of 14 pneumococcal strains when used in combination with LSM111, which is within the sequence of pspA encoding the proline-rich region. Combinations of LSM2 with primers upstream in pspA were variably successful in amplifying sequences (Table 65). The lowest frequency of amplification was observed with LSM112, which was derived from the Rx1 sequence 5' to the pspA start site. This oligonucleotide was not used in the hybridization studies. DNA fragments generated by PCR were blotted and hybridized with a full-length pspA probe to confirm homology to pspA.

[0285] Further evidence for variability at the pspA locus comes from the differences in the sizes of the amplified pspA gene. When PCR primers LSM12 and LSM2 were used to amplify the entire coding region of PspA, PCR products from different pneumococcal isolates ranged in size from 1.9 to 2.3 kbp. The regions of pspA which encode the α-helical, proline-rich, and repeat domains were amplified from corresponding strains, and variation in pspA appears to come from sequences within the α-helical coding region.



Table 65. Amplification of pspA by PCR using the indicated oligonucleotides as 5' primers in combination with the 3' primer LSM2.

5' primer   Domain            Amplified/Tested   Percent Amplified
LSM112      -35 (upstream)    2/14               14
LSM12       leader            8/14               57
LSM3        α-helical         3/14               21
LSM7        proline           12/14              86
LSM111      proline           14/14              100
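The percentages in Table 65 follow directly from the amplified/tested counts, as the short sketch below reproduces. The counts are transcribed from the table; rounding to the nearest whole percent is an assumption about how the table values were derived.

```python
# Amplified/tested counts from Table 65 (3' primer LSM2 in every case).
TABLE_65 = {
    "LSM112": (2, 14),
    "LSM12": (8, 14),
    "LSM3": (3, 14),
    "LSM7": (12, 14),
    "LSM111": (14, 14),
}


def percent_amplified(amplified, tested):
    """Fraction of tested strains that gave a product, as a whole percent."""
    return round(100 * amplified / tested)


percentages = {
    primer: percent_amplified(*counts) for primer, counts in TABLE_65.items()
}
```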



[0286] These studies have provided a finer resolution map of the location of conserved and variable sequences within pspA. Additionally, regions of divergence and identity between pspA and the pspC sequences have been identified. These data confirmed serological studies, and demonstrated that pspA and pspC sequences are highly variable at the DNA sequence level. The diversity of HindIII restriction fragment polymorphisms containing pspA and the pspC sequence supported earlier data using larger probes that detected extensive variability of the DNA in and around these sequences.

[0287] A useful pspA-specific DNA probe would identify Rx1 and WU2 pspA genes, for which restriction maps are known, and would identify only a single restriction fragment in most strains. Two probes, LSM3 and LSM4, do not hybridize with more than one HindIII restriction fragment in any strain of pneumococcus. Both of these oligonucleotides hybridize with Rx1 pspA, and LSM3 hybridizes with WU2 pspA. However, each of these probes hybridizes with only four of the other 15 strains. When these probes identify a fragment, however, it is generally also detected by all other Rx1-derived probes. Oligonucleotides from the second proline-rich region (LSM111) and the C-terminus of pspA (LSM2) generally identify only one pspA-homologous sequence at high stringency. Collectively, LSM111, 2, 3 and 4 react with 16 of the 17 isolates and in each case revealed a consensus DNA fragment recognized by most or all of the oligonucleotide probes.

[0288] When an oligonucleotide probe detected only a single DNA fragment, it was presumed to be pspA. If the probe detected multiple fragments, it was presumed to hybridize with pspA and the pspC sequence. Based on these assumptions, the most variable portion between pspA and pspC is the region immediately upstream from the -35 promoter region and that portion encoding the α-helical region. The most conserved portion between pspA and pspC was found to be the repeat region, the leader and the proline-rich region sequences. Although only one probe from within the repeat region was used, the high degree of conservation among the 10 repeats in the Rx1 sequence makes it likely that other probes within the repeat sequences would give similar results.

[0289] The portion of Rx1 pspA most similar to the pspC sequence was that encoding the leader peptide, the upstream portion of the proline-rich region, and the repeat region. The repeat region of PspA has been shown to be involved in the attachment of this protein to the pneumococcal cell surface. The conservation of the repeat region within pspC sequences suggests that if these loci encode a protein, it may have a similar functional attachment domain. The conservation of the leader sequence between pspA and the pspC sequence was also not surprising, since similar conservation has been reported for the leader sequence of other proteins from gram positive organisms, such as M protein of group A streptococci (Haanes-Fritz, E. et al., Nucl. Acids Res. 1988; 16: 4667-4677).

[0290] In two strains, some oligonucleotide probes identified more than two pspA-homologous sequences. In these strains, there was a predominant sequence recognized by almost all of the probes, and two or three additional sequences that share homology with DNA encoding the leader, α-helical, and proline regions, but have no homology with sequences encoding the repeat region in the C-terminus of PspA. These sequences might serve as cassettes which can recombine with pspA and/or the pspC sequences to generate antigenic diversity. Alternatively, the sequences might encode proteins with very different C-terminal regions and might not be surface attached by the mechanism of PspA.

[0291] Oligonucleotides which hybridize with a single chromosomal DNA fragment were used as primers in PCR to examine the variability of domains within pspA. These results demonstrate that full-length pspA varies in size among strains of pneumococci, and that this variability is almost exclusively the result of sequences in the α-helix coding region.

EXAMPLE 12 - CLONING OF PspC

[0292] Chromosomal DNA from S. pneumoniae EF6796, a serotype 6A clinical isolate, was isolated by methods including purification through a cesium chloride gradient, as described in Example 8. The HindIII-EcoRI fragment of EF6796 was cloned in modified pZero vector (Invitrogen, San Diego, CA) in which the Zeocin-resistance cassette was replaced by a kanamycin cassette (shown in Figure 18). Recombinant plasmids were electroporated into Escherichia coli TOP10F' cells [F' {lacIq TetR} mcrA Δ(mrr-hsdRMS-mcrBC) φ80lacZΔM15 ΔlacX74 deoR recA1 araD139 Δ(ara-leu)7967 galU galK rpsL endA1 nupG] (Invitrogen).

[0293] The 5' region of pspA.Rx1 does not hybridize to the pspC sequence at high stringencies by Southern analysis. Utilizing both the full-length Rx1 pspA probe and a probe containing the sequence encoding the α-helical region of PspA, it was possible to identify which DNA fragment contained pspA and which fragment contained the pspC locus. The pspC locus and the pspA gene of EF6796 were mapped using restriction enzymes. After digestion of chromosomal DNA with HindIII, the pspC locus was localized to a fragment of approximately 6.8 kb. Following a double digest with HindIII and EcoRI, the pspC locus was located in a 3.5 kb fragment. To obtain the intact pspC gene of EF6796, chromosomal DNA was digested with HindIII and separated by agarose gel electrophoresis; the region between 6 and 7.5 kb was purified and subsequently digested with EcoRI. This digested DNA was analyzed by electrophoresis, and DNA fragments of 3.0 to 4.0 kb were purified (GeneClean, Bio101, Inc., Vista, CA). The size-fractionated DNA was then ligated in HindIII-EcoRI-digested pZero, and electroporated into E. coli TOP10F' cells. Kanamycin-resistant transformants were screened by colony blots and probed with full-length pspA. A transformant, LXS200, contained a vector with a 3.5 kb insert which hybridized to pspA.

[0294] Sequencing of pspC in pLXS200 was completed using automated DNA sequencing on an ABI 377 (Applied Biosystems, Inc., PLACE). Sequence analyses were performed using the University of Wisconsin Genetics Computer Group (GCG) programs supported by the Center for AIDS Research (P30 AI 27767), MacVector 5.0, Sequencher 2.1, and DNA Strider programs. Sequence similarities of pspC were determined using the NCBI BLAST server. The coiled-coil structure predicted by the pspC sequence was analyzed using Matcher.

A gene probe for cloning the pspC locus 

[0295] Two oligonucleotide primers, N192 and C558 (shown in Figure 19), have been used previously to clone fragments homologous to the region of Rx1 pspA encoding amino acids 192-588 from various pneumococcal strains. These primers are modifications (altered restriction sites) of LSM4 and LSM2, which were previously shown to amplify DNA encoding the C-terminal 396 amino acids of PspA.Rx1 (Figure 17); this includes approximately 100 amino acids of the α-helical region, the proline-rich region, and the C-terminal choline-binding repeat region. Using primers N192 and C558, a 1.2 kb fragment from strain EF6796 was amplified by PCR, and subsequently cloned in pET-9A (designated pRCT135). This insert was then partially sequenced.

[0296] Independently, a larger pspA fragment from strain EF6796 was made using primers LSM13 and SKH2 (shown in Figure 19) for the purpose of direct sequencing of serologically diverse pspA genes.

[0297] The LSM13 and SKH2 primer pair results in the amplification of the 5' end of most pspA gene(s), encoding the upstream promoter, the leader peptide, the α-helical, and the proline-rich regions (amino acid -15 to 450) (Figure 20). From the strain EF6796, the LSM13 and SKH2 primers amplified a 1.3 kb fragment (pspA.EF6796), which was sequenced. The sequences from pRCT135 and the LSM13/SKH2 PCR-generated fragment pspA.EF6796 were not identical. The fragment obtained by PCR using primers LSM13 and SKH2 was designated pspA based on its location within the same chromosomal location as pspA.Rx1. The cloned fragment in pRCT135 was assumed to represent the sequence of the second gene locus, pspC, known to be present from Southern analysis. Both genes have significant similarity to the corresponding regions of the prototype pspA gene from strain Rx1. The second gene locus was called pspC in recognition of its distinct chromosomal location, not sequence differences from the prototype pspA gene.

Analysis of the nucleotide and amino acid sequence of pspC EF6796 


[0298] To test the hypothesis that pRCT135 represented pspC of EF6796, and to further investigate pspC, the entire EF6796 pspC gene was cloned as a 3.4 kb HindIII-EcoRI fragment forming pLXS200. DNA sequence of the pspC-containing clone pLXS200 revealed an open reading frame of 2782 nucleotides based on the analysis of putative transcriptional and translational start and stop sites (Figure 21). The predicted open reading frame encodes a 105 kDa protein which has an estimated pI of 6.09.

[0299] PspA.Rx1 and PspC.EF6796 are similar in that they both contain an α-helical region followed by a proline-rich domain and repeat region (Figure 20). However, there are several features of the amino acid sequence of PspC which are quite distinct from PspA. From comparisons at the nucleotide as well as the predicted amino acid sequence, it is apparent that the region of strong homology between PspC and PspA begins at amino acid 458 of PspC (amino acid 147 of PspA) and extends to the C-terminus of both proteins (positions 899 and 588, respectively). The predicted amino acid sequences of PspC.EF6796 and PspA.Rx1 are 76% similar and 68% identical based on the GCG Bestfit program for this region (Figure 22). The nucleotide sequence identity between pspC and pspA is 87% for the same region. Eight bases upstream of the ATG start site is a putative ribosomal binding site, TAGAAGGA. The proposed transcriptional start -35 (TATACA) and -10 (TATAGT) regions are located between 258 to 263 and 280 to 285, respectively (Figure 21). A potential transcriptional terminator occurs at a stem loop between nucleotides 3237 through 3287. The putative signal sequence of PspC is typical of other gram positive bacteria. This region consists of a charged region followed by a hydrophobic core of amino acids. A potential cleavage site of the signal peptide occurs at amino acid 37 following the Val-His-Ala. The first amino acid of the mature protein is a Glu residue.

[0300] Other than features similar to all signal sequences, there is no homology in this region between pspA and pspC. This confirms that pspC is present in a separate chromosomal locus from that of pspA. The signal sequence and upstream region have striking similarity to the similar regions of S. agalactiae β antigen (accession number X59771). The β antigen of Group B streptococci is a cell surface receptor that binds IgA. Similarity to the bac gene ends with the start of the mature protein of PspC, and the nucleotides are 75% identical in this region. Thus, although pspC is in a very similar chromosomal locus to the β antigen, it is clearly a distinct protein.
[0301] The N-terminus of PspC is quite different from the N-terminus of PspA. Prediction of the secondary structure utilizing Chou-Fasman analysis (Chou, P.Y. et al., Adv. Enzymol. Relat. Areas Mol. Biol. 1978; 47: 45-148) suggests that the structure of amino acids 16 to 589 of PspC is predominately α-helical. The Matcher program was used to examine periodicity in the α-helical region of PspA. The characteristic seven residue periodicity is maintained by having hydrophobic residues at the first and fourth positions (a and d) and hydrophilic residues at the remaining positions. The coiled-coil region of the α-helix of PspC (between amino acid 32 to 600) has three breaks in the heptad repeat (Figure 23). These disturbances in the 7 residue periodicity occur at amino acids 99 to 104, 224 to 267 and 346 to 350. The α-helical region of PspA has seven breaks in the motif, each break ranging from a few amino acids to 23 amino acids. In contrast, the three breaks in the coiled-coil motif of PspC involve 5, 43 and 4 amino acids, respectively.
[0302] The sequence encoding the α-helical region of PspC contains two direct repeats 483 nucleotides (160 amino acids) long which are 88% identical at the nucleotide level. These repeats, which occur between nucleotides 562 to 1045 and nucleotides 1312 to 1795, are conserved both at the nucleotide and amino acid level (amino acids 188 to 348 and 438 to 598) (Figure 24). PspA lacks evidence for any repeats this prominent within the α-helical region. These repeat regions could provide a mechanism for recombination that could alter the N-terminal half of the PspC molecule. Although repeat motifs are common in bacterial surface proteins, a direct repeat this large or separated by a large spacer region is novel. The evolutionary significance of this region is not known. A BLAST search of the repeat region and the 267 nucleotide bases between them revealed no sequence with significant homology at the nucleotide or amino acid level. However, one of the structural breaks in the coiled-coil region of PspC is the region between the two repeats. Perhaps some deviation from coiled-coil structure between the two repeats is critical to maintain the α-helical structure.
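The repeat coordinates quoted above are internally consistent, which a few lines of arithmetic confirm. The sketch below uses only the coordinates from the text and computes lengths as simple differences (end minus start), which is the convention that reproduces the stated 483 nucleotide, 160 amino acid and 267 nucleotide figures; inclusive counting would add one to each.

```python
# Coordinates of the two direct repeats in pspC, as given in the text.
REPEATS_NT = [(562, 1045), (1312, 1795)]   # nucleotide positions
REPEATS_AA = [(188, 348), (438, 598)]      # amino acid positions


def extent(start, end):
    """Distance between two coordinates (end - start)."""
    return end - start


nt_lengths = [extent(start, end) for start, end in REPEATS_NT]
aa_lengths = [extent(start, end) for start, end in REPEATS_AA]

# Spacer between the end of the first repeat and the start of the second.
spacer_nt = REPEATS_NT[1][0] - REPEATS_NT[0][1]
```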

[0303] Previous studies have shown that a major cross-protective region of PspA comprises the C-terminal 1/3 of the α-helical region (between residues 192 and 260 of PspA.Rx1). This region accounts for the binding of 4 of 5 cross-protective immunity in mice. Homology between PspC and PspA begins at amino acid 148 of PspA, thus including the region from 192 - 299. The homology between PspA and PspC includes the entire PspC sequence C-terminal of amino acid 486. Based on the fact that PspA and PspC are so similar in this region known to be protection-eliciting, PspC is also likely to be a protection-eliciting molecule. Because of close sequence and conformational similarity of the proteins in this region, antibodies specific for the region of PspA between amino acid 148 and 299 should cross-react with PspC and thus afford protection by reacting with PspC and PspA. Likewise, immunization with PspC would be expected to elicit antibodies cross-protective against PspA. The differences between PspC of strain EF6796 and PspA of strain Rx1 are no greater than the differences between many additional PspAs, which have been shown to be highly cross-protective.

[0304] A proline-rich domain exists between amino acid 590 to 652. The sequence PAPAPEK is repeated six times in this region. This region is very similar to the proline-rich region of PspA.Rx1, which contains the sequence PAPAP repeated eight times in two proline-rich regions. These two regions of PspA.Rx1 are separated by 27 charged amino acids; no such spacer region is present in PspC.

[0305] Many cell surface proteins of other gram positive bacteria contain proline-rich regions. These are often associated with a domain of protein that is predicted to be near the cell wall murein layer when the protein is cell-associated. For example, in M proteins of S. pyogenes this domain contains both Pro- and Gly-rich regions. The fibronectin-binding protein of S. pyogenes, S. dysgalactiae, and Staphylococcus aureus contains a proline-rich region with a three-residue periodicity (pro-charged-uncharged) that is not found in PspA or PspC. An M-like protein of S. equi contains a proline-rich region that is comprised of the tetrapeptide PEPK. This region lacks glycine normally found in the proline regions of M-proteins. The last proline repeat region of this molecule is PAPAK, which is more similar to the proline-region of PspA and PspC than it is to M-proteins.

[0306] Proline-rich regions of gram positive bacterial proteins have been reported previously to transit the cell wall. The differences in proline-rich regions of proteins from diverse bacteria may reflect differences in protein function or possibly subtle differences in cell wall function. Proline-rich regions are thought to be responsible for aberrant migration of these proteins through SDS-polyacrylamide gels.

[0307] The repeat region of PspC is a common motif found among several proteins in gram positive organisms. Autolysin of S. pneumoniae, toxins A and B of Clostridium difficile, glycosyltransferases from S. downei and S. mutans, and CspA of C. acetobutylicum all contain similar regions. In PspA these repeats are responsible for binding to the phosphatidylcholine of teichoic acid and lipoteichoic acid in the cell wall of pneumococci. However, bacterial proteins containing C-terminal repeats are secreted, which may imply either a lost or gained function. Although all of these proteins have similar repeat regions, the similarity of the repeat regions of PspA and PspC is much greater than that of PspC to the other proteins (Table 66).

[0308] Interestingly, PspC, like PspA, has a 17 amino acid partially hydrophobic tail. The function of this 17 amino
acid region is unknown. In the case of PspA it has been shown that mutants lacking the tail bind the surface of
pneumococci as well as PspAs in which the tail is expressed. Presently, it is not known whether PspC is attached to the
cell surface or secreted.

[0309] PspA and PspC proteins both have alpha-helical coiled-coil regions, proline-rich central regions, repeat regions
with choline-binding motifs, and the C-terminal 17 amino acid tail. PspA and PspC share three regions of high
sequence identity. One of these is a protection-eliciting region present within the alpha-helical domain. The other two
regions are the proline-rich domain and a repeat domain shared with other choline-binding proteins and thought to play
a role in cell surface association. The similarity throughout most of the structure of the PspA and the PspC molecules
raises the possibility that the two molecules may play at least slightly redundant functions. However, the fact that the
N-terminal half of PspC is not homologous to any of the alpha-helical sequence of PspA suggests that PspC and PspA
may have evolved for at least somewhat different roles on the cell surface. One of the most striking differences between
the two molecules is the single repeat in the alpha-helical region of PspC. Although the exact function of neither PspA
nor PspC is known, the observation that a major cross-protective region of PspA is highly homologous with a similar
region of PspC raises the possibility that both molecules are protection-eliciting and elicit cross-protective antibodies.
[0310] The sequence similarity between the promoter region of the pspC gene and the bac gene from group B
streptococci is very intriguing. It implies that an interspecies recombination event has occurred and that this
interspecies recombination has contributed to the evolution of pspC. The pspC gene thus has a chimeric structure,
being partially like pspA and partially like the gene for the β antigen. In the latter case, all protein similarity is limited
to the signal sequence. Similar interspecies recombination events have contributed to the evolution of the genes
encoding penicillin-binding protein.
[0311] Using analogous procedures, a second PspC sequence was isolated from strain D39 of S. pneumoniae.
Figures 25 to 29 show the sequence data of PspC from strain D39, complete from upstream of the promoter through
the proline-rich region. Strain D39 has the same genetic background as strain Rx1, from which pspA was sequenced.
D39 and Rx1 have the same pspC gene based on Southern blot analysis.

[0312] The alpha-helical encoding region of the D39 pspC gene is one third of the size of the homologous region
from the EF6796 pspC gene. The proline-rich region of the D39 pspC gene was more similar to Rx1 pspA than to
EF6796 pspC. Even so, the two pspC genes were 86% identical at the nucleotide level and 67% identical at the
amino acid level.

[0313] In the alpha-helical sequence of EF6797 pspC a strong repeat was observed. This was absent in the pspC
sequence of D39. The D39 pspC sequence also lacks a leader sequence, found in the EF6797 pspC sequence.
[0314] These data strongly indicate that there is variability in the structure of pspC, similar to previous observations
for pspA. In the case of pspC, however, the extent of variability appears to be even greater than that which has been
observed for pspA.



Table 66.

PERCENT HOMOLOGY OF CHOLINE BINDING REGIONS

                                            Percent similarity/identity
Protein               Organism              PspA          PspC
PspC                  S. pneumoniae         86/60         100/100
Bacteriophage Cp-1    S. pneumoniae         56/30         56/28
LytA                  S. pneumoniae         57/33         61/32
PapA                  C. perfringens        64/45         59/42
alpha toxin           C. novyi              54/29         57/33
CspB                  C. acetobutylicum     58/36         61/45

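The similarity/identity figures in Table 66 are the kind of values obtained by comparing aligned sequences position by position. The following is a minimal Python sketch of such a calculation, offered only as an illustration: the text does not state which alignment program or scoring scheme produced the values in Table 66, and the function names, the residue groupings in SIMILAR_GROUPS, and the choice to skip columns containing a gap are all assumptions made for this example.

```python
# Hypothetical groupings of chemically similar amino acids (one-letter codes).
# These are an assumption for illustration, not the scheme used for Table 66.
SIMILAR_GROUPS = ("AVLIM", "FYW", "KRH", "DE", "STNQ")


def _aligned_pairs(a, b):
    """Return residue pairs from two pre-aligned sequences, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]


def percent_identity(a, b):
    """Percentage of compared positions holding the same residue."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def percent_similarity(a, b):
    """Percentage of compared positions that are identical or in the same group."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0

    def similar(x, y):
        return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)

    return 100.0 * sum(similar(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy aligned fragments, not taken from PspA or PspC.
    print(percent_similarity("KDE", "RDD"), percent_identity("KDE", "RDD"))
```

In this toy comparison only the middle position is identical, while all three positions count as similar under the assumed groupings, which is why a similarity value is always at least as large as the corresponding identity value.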

[0315] Having thus described in detail certain preferred embodiments of the present invention, it is to be understood
that the invention defined by the appended claims is not to be limited by particular details set forth in the above
description, as many apparent variations thereof are possible without departing from the spirit or scope thereof.

SUMMARY

[0317] The present invention is now summarised by way of the following numbered paragraphs:

1. An isolated amino acid molecule consisting of residues 1 to 115, 1 to 260, 192 to 588, 192 to 299, or residues
192 to 260 of pneumococcal surface protein A of Streptococcus pneumoniae.

2. An isolated DNA molecule consisting of a fragment of pneumococcal surface protein A gene of Streptococcus
pneumoniae encoding the isolated amino acid molecule of paragraph 1.

3. A PCR primer consisting essentially of the isolated DNA molecule of paragraph 2. 

4. A hybridization probe consisting essentially of the isolated DNA molecule of paragraph 2. 

5. An immunological composition comprising the amino acid molecule of paragraph 1.

6. An isolated DNA molecule consisting of nucleotides 1 to 26, 1967 to 1990, 161 to 187, 1093 to 1117 or 1312 to
1331, or 1333 to 1355 of a pneumococcal surface protein A gene of Streptococcus pneumoniae.

7. A PCR primer consisting essentially of the isolated DNA molecule of paragraph 6.

8. A hybridization probe consisting essentially of the isolated DNA molecule of paragraph 6. 

9. An isolated DNA molecule consisting of a fragment of a pneumococcal surface protein A gene of Streptococcus
pneumoniae consisting of a nucleotide sequence (5' to 3') selected from

CCGGATCCAGCTCCTGCACCAAAAAC;

GCGCGTCGACGGCTTAAACCCATTCACCATTGG;

CCGGATCCTGAGCCAGAGCAGTTGGCTG;

CCGGATCCGCTCAAAGAGATTGATGAGTCTG;

GCGGATCCCGTAGCCAGTCAGTCTAAAGCTG;

CTGAGTCGACTGGAGTTTCTGGAGCTGGAGC;

CCGGATCCAGCTCCAGCTCCAGAAACTCCAG;

GCGGATCCTTGACCAATATTTACGGAGGAGGC;

GTTTTTGGTGCAGGAGCTGG;

GCTATGGGCTACAGGTTG;

CCACCTGTAGCCATAGC;

CCGCATCCAGCGTGCCTATCTTAGGGGCTGGTT; and

GCAAGCTTATGATATAGAAATTTGTAAC.

10. A PCR primer consisting essentially of at least one isolated DNA molecule of paragraph 9.

11. A hybridization probe consisting essentially of at least one isolated DNA molecule of paragraph 9.

12. PCR probe(s) which distinguishes between pspA and pspA-like nucleotide sequences. 

13. PCR probe(s) which hybridizes to both pspA and pspA-like nucleotide sequences. 

14. A PspA extract prepared by a process comprising growing pneumococci in a first medium containing choline
chloride, eluting live pneumococci with a choline chloride containing salt solution, and growing the pneumococci
in a second medium containing an alkanolamine and substantially no choline.

15. A PspA extract prepared by growing pneumococci in a first medium containing choline chloride, eluting live 
pneumococci with a choline chloride containing salt solution, growing the pneumococci in a second medium con- 
taining an alkanolamine and substantially no choline, and purifying PspA by isolation on a choline-Sepharose 
affinity column. 

16. An immunological composition comprising the extract of paragraph 14.

17. An immunological composition comprising the extract of paragraph 15. 

18. An immunological composition comprising full length PspA.

19. A method for enhancing immunogenicity of a PspA containing immunological composition comprising including 
in said composition the C-terminal portion of PspA. 

20. An immunological composition comprising at least two PspAs. 

21. The immunological composition of paragraph 20 wherein the PspAs are from different groups based on RFLP.

22. PCR amplification product from a primer as described in paragraphs 3, 7, 10, 12 or 13. 

23. An isolated DNA molecule consisting of a nucleotide sequence homologous to a portion of pspA.

24. An isolated amino acid molecule comprising pneumococcal surface protein C, PspC, of Streptococcus pneu- 
moniae having alpha-helical, proline rich and repeat regions. 

25. An isolated DNA molecule comprising a pneumococcal surface protein C gene of S. pneumoniae encoding 
the isolated amino acid molecule of paragraph 24. 

26. A PCR primer consisting essentially of the isolated DNA molecule of paragraph 25. 

27. A hybridization probe consisting essentially of the isolated DNA molecule of paragraph 25. 

28. An immunological composition comprising the amino acid molecule of paragraph 24.

29. An isolated amino acid molecule of paragraph 24 having strong homology with pneumococcal surface protein
A, PspA, of S. pneumoniae from amino acid 458 of PspC, corresponding to amino acid 147 of PspA, extending to
a C-terminus of PspC and PspA.

30. An isolated amino acid molecule of paragraph 24, further comprising a signal sequence consisting essentially
of a charged region followed by a hydrophobic core of amino acids.

31. An isolated amino acid molecule of paragraph 24, wherein the alpha-helical region further comprises a seven
residue periodicity and a coiled coil region having three breaks in a heptad repeat.

32. An isolated amino acid molecule comprising pneumococcal surface protein C, PspC, of S. pneumoniae having
alpha-helical, proline rich and repeat regions, wherein the alpha-helical region comprises a C-terminus having
substantial homology with a protection-eliciting region of PspA.

33. An isolated DNA molecule comprising a pneumococcal surface protein C gene of S. pneumoniae encoding
the isolated amino acid molecule of paragraph 32.

34. A PCR primer consisting essentially of the isolated DNA molecule of paragraph 33.

35. A hybridization probe consisting essentially of the isolated DNA molecule of paragraph 33.

36. An immunological composition comprising the amino acid molecule of paragraph 32. 

37. An isolated amino acid molecule of paragraph 24, further comprising a 17 amino acid, partially hydrophobic tail.

38. An isolated amino acid molecule of paragraph 32, further comprising a 17 amino acid, partially hydrophobic tail.

39. An isolated amino acid molecule of paragraph 24, further comprising an epitope of interest.

40. An isolated amino acid molecule of paragraph 32, further comprising an epitope of interest.

41. An immunological composition comprising the amino acid molecule of paragraph 39.

42. An immunological composition comprising the amino acid molecule of paragraph 40.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Briles, David E.
McDaniel, Larry S.
Swiatlo, Edwin
Yother, Janet
Crain, Marilyn J.
Hollingshead, Susan
Tart, Rebecca
Brooks-Walter, Alexis

(ii) TITLE OF INVENTION: PNEUMOCOCCAL GENES, PORTIONS THEREOF,
EXPRESSION PRODUCTS THEREFROM, AND USES OF SUCH GENES,
PORTIONS AND PRODUCTS

(iii) NUMBER OF SEQUENCES: 47

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Curtis, Morris & Safford, P.C.
(B) STREET: 530 Fifth Avenue
(C) CITY: New York
(D) STATE: New York
(E) COUNTRY: USA
(F) ZIP: 10036

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/714,741
(B) FILING DATE: 16-SEP-1996
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Frommer Esq., William S.
(B) REGISTRATION NUMBER: 25,506
(C) REFERENCE/DOCKET NUMBER: 454312-2460

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (212) 840-3333
(B) TELEFAX: (212) 840-0712



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CCGGATCCAG CTCCTGCACC AAAAAC

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GCGCGTCGAC GGCTTAAACC CATTCACCAT TGG

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCGGATCCTG AGCCAGAGCA GTTGGCTG

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CCGGATCCGC TCAAAGAGAT TGATGAGTCT G

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GCGGATCCCG TAGCCAGTCA GTCTAAAGCT G

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CTGAGTCGAC TGGAGTTTCT GGAGCTGGAG C

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CCGGATCCAG CTCCAGCTCC AGAAACTCCA G

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GCGGATCCTT GACCAATATT TACGGAGGAG GC

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GTTTTTGGTG CAGGAGCTGG

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GCTATGGGCT ACAGGTTG

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CCACCTGTAG CCATAGC

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CCGCATCCAG CGTGCCTATC TTAGGGGCTG GTT

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
GCAAGCTTAT GATATAGAAA TTTGTAAC
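
Several of the primers given in numbered paragraph 9 of the Summary and as SEQ ID NOS: 1 to 13 contain hexamers that match well-known restriction enzyme recognition sites. The Python sketch below, given purely as an illustration, checks each primer's length against the LENGTH stated in its entry and reports which of three common recognition sites (BamHI GGATCC, SalI GTCGAC, HindIII AAGCTT) it contains. The dictionary and function names are hypothetical, and the passage above does not itself say which enzymes, if any, were intended.

```python
# Primer sequences (5' to 3') and stated lengths for SEQ ID NOS: 1 to 13.
PRIMERS = {
    1: ("CCGGATCCAGCTCCTGCACCAAAAAC", 26),
    2: ("GCGCGTCGACGGCTTAAACCCATTCACCATTGG", 33),
    3: ("CCGGATCCTGAGCCAGAGCAGTTGGCTG", 28),
    4: ("CCGGATCCGCTCAAAGAGATTGATGAGTCTG", 31),
    5: ("GCGGATCCCGTAGCCAGTCAGTCTAAAGCTG", 31),
    6: ("CTGAGTCGACTGGAGTTTCTGGAGCTGGAGC", 31),
    7: ("CCGGATCCAGCTCCAGCTCCAGAAACTCCAG", 31),
    8: ("GCGGATCCTTGACCAATATTTACGGAGGAGGC", 32),
    9: ("GTTTTTGGTGCAGGAGCTGG", 20),
    10: ("GCTATGGGCTACAGGTTG", 18),
    11: ("CCACCTGTAGCCATAGC", 17),
    12: ("CCGCATCCAGCGTGCCTATCTTAGGGGCTGGTT", 33),
    13: ("GCAAGCTTATGATATAGAAATTTGTAAC", 28),
}

# Recognition sites of three widely used restriction enzymes.
# Which enzymes were actually intended is an assumption of this example.
SITES = {"BamHI": "GGATCC", "SalI": "GTCGAC", "HindIII": "AAGCTT"}


def length_matches(seq_id):
    """True if the primer's actual length equals the stated LENGTH."""
    seq, stated = PRIMERS[seq_id]
    return len(seq) == stated


def sites_in(seq_id):
    """Names of the recognition sites in SITES that occur in the primer."""
    seq, _ = PRIMERS[seq_id]
    return [name for name, site in SITES.items() if site in seq]


if __name__ == "__main__":
    for seq_id in sorted(PRIMERS):
        print(seq_id, length_matches(seq_id), sites_in(seq_id))
```

A check of this kind is also a convenient way to catch transcription errors in a sequence listing, since a single dropped or altered base changes the length or removes an expected site.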
(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GGAAGGCCAT ATGCTCAAAG AGATTGATGA GTCT

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CCAAGGATCC TTAAACCCAT TCACCATTGG C

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
CCGGATCCGC TCAAAGAGAT TGATGAGTCT G

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CTGAGTCGAC TGAGTTTCTG GAGCTGGAGC

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GCGCGTCGAC GGCTTAAACC CATTCACCAT TGG

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
CCGGATCCAG CTCCTGCACC AAAAAC

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GCAAGCTTAT GATATAGAAA TTTGTAAC

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CCACATACCO TTTTCTTGTT TCCAGCC 
(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CCGGATCCAG CTCCTGCACC AAAAC

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
CCGGATCCTG AGCCAGAGCA GTTGGCTG

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CCGGATCCGC TCAAAGAGAT TGATGAGTCT G

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GCGGATCCCG TAGCCAGTCA GTCTAAAGCT G

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
CTGAGTCGAC TGGAGTTTCT GGAGCTGGAG C

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
CCGGATCCAG CTCCAGCTCC AGAAACTCCA G

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
GTTTTTGGTG CAGGAGCTGG

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GCTATGGCTA CAGGTTG

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
CCGGATCCAG CGTGCCTATC TTAGGGGCTG GT

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
GCGGATCCTT GACCAATAAC GGAGGAGGC

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8991 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Met Asn Lys Lys Lys Met Ile Leu Thr Ser Leu Ala Ser Val Ala Ile
1 5 10 15

Leu Gry Thr Gly Phe Val Ala Ser Pro Pro Thr I^u Val Arg Ala Gin 
20 25 30 

Ghi Ser Pro Gin Val Val Gto Lys Ser Ser Leu Gto Lys Lys Tyr Ghi 
35 40 45 

Ghi Ala Lys Ala Lys Ala Asp Thr Ala Lys Lys Asp Tyr Ghi Thr Ala 
50 55 60 

Lys Lys Lys Ala Ghi Asp Ala Gin Lys Lys Tyr Asp Ghi Asp Gin Lys 
65 70 75 80 

Lys Thr Gtu Asp Lys Ala Lys Ala Val Lys Lys Val Asp Ghi Ghi Afg 
85 90 95 

Gto Lys Ala Dc Leu Ala Val Gin Lys Ala Tyr Val Ghi Tyr Arg Ghi 
100 105 110 

Ala Lys Asp Lys Ala Ser Ala Ghi Lys Gin He Ala Ghi Ala Lys Arg 
115 120 125 

Lys Thr Met Asn Lys Lys Lys Met lie Leu Thr Ser Leu Ala Ser Val 
130 135 140 

Ala He Leu Gry Ala Gly Leo Val Thr Ala Ghi Pro Thr Leu Val Arg 
145 150 155 160 

Ala Ghi Ghi Ala Pro Val Ala Ser Gin Ser Lys Ala Glu Lys Asp Tyr 
165 170 175 

Asp Thr Ala Lys Arg Asp Ala Ghi Asn Ala Lys Lys Ala Leu Ghi Ghi 
180 185 190 

Ah Lys Arg Ak (Hn Lys Lys iy Ghi Asp Asp Gin Lys Lys Thr Gut 
195 200 205 

Ghi Lys Ala Lys Ghi Ghi Lys Gin Ala Ser Gto Ala Ghi Gto Lys Ala 
210 215 220 

Asn Leu Gh Tyr Gin Leu Lys Leu Arg Ghi iyr lie Gin Lys Thr Gly 
225 230 235 240 

Asp Arg Ser Lys lie Gto Thr Glu Met Glu Ghi Ala Ghi Lys Lys His 
245 250 255 

Lys Thr Ala Lys Ala Glu Phe Asp Lys Val Arg Gly Thr Val lie Pro 
260 265 270 

Ser Ala Ala Arg Val Met Asn Lys Lys Lys Met De Leu Thr Ser Leu 
275 280 285 

Ala Ser Val Ala lie Leu Gry Ala Gly Leu Val Thr Ser G In Pro Thr 
290 295 300 

Leu Val Arg Ala Glu Ghi Ala Pro Val Ala Ser Gin Ser Lys Ala Glu
305 310 315 320

Lys Asp Tyr Asp Ala Ala Val Lys Lys Ser Ghi Ala Ala Lys Lys Ala 
325 330 335 

Tyr Ghi Glu Ala Lys Lys Lys Ala Glu Asp Ala Gin Lys Lys Tyt Asp 
340 345 350 

G hi Asp Ghi Lys Lys Hit Ghi Ghi Lys Ala Ghi Asa Ghi Lys Lys Ah 
355 360 365 

Ala Ala Asp Leu Tte Ghi Ala Thr Ghi Val His Gin Lys Ala Tyr Val 
370 375 380 

Aig tyr Ser Gry Ser Asn Glu Gin Lys De Lys Asn Phe Lys De Leo 
385 390 395 400 

Ala Ife Met Xaa Lys Lys Lys Met Ik Leu Thr Ser Leu Ala Ser Val 
405 410 415 

Ala lie Leu Gry Ala Gry Xaa Val Ala Ser Gin Pro Hit Xaa Val Arg 
420 425 430 

Ala Glu Asp Ala Pro Val Ala Asia Ghi Ser Ghi Ala Glu Lys Asp Tyr 
435 440 445 

Xaa Ala Ala Xaa Xaa Lys Ser Ghi Ala Ala Lys Lys Xaa Tyr Xaa Xaa 
450 455 460 

Ala Lys Lys Val Leu Ala Ghi Ala Ghi Ala Ala Gin Lys Xaa Xaa Glu 
465 470 475 480 

Asp Xaa Gtn Lys Lys Pro Ghi Ghi Lys Ala Glu Lys Ala Lys Ala Ala 
485 490 495 

Ser Ghi Ghi He Val Lys Ala Thr Glu Ghi Val Gin Xaa Ala Ala Met 
500 505 510 

Asn Lys Lys Lys Met lie Leu Thr Ser Leu Ala Ser Val Ala lie Leu 
515 520 525 

Gry Ala Gry Leu Val Ttir Ser Ghi Pro Thr Leu Val Arg Ala Ghi Ghi 
530 535 540 

Ala Pro Gry Ala Ser Gin Ser Lys Ala Ghi Lys Asp Tyr Xaa Ala Ala 
545 550 555 560 

Xaa Lys Lys Ser Ghi Ala Ala Lys Lys Ala Tyr Ghi Ghi Ala Lys Lys 
565 570 575 

Lys Ala Ghi Asp Ala Gin Lys Lys Tyr Asp Glu Gry Gin Lys Lys Thr 
580 585 590 

Glu Ghi Lys Ala Arg Lys Ala Glu Ghi Ala Ser Lys Ghi Leu Ala Lys 
595 600 605 

Ala Thr Ser Ghi Val Ghi Asn Ala Tyr Val Lys Tyr Gin Gry Val Gin 
610 615 620

Arg Asn Ser Arg Lea Asa Gb Lys Glu Arg Lys Lys Gb Leu Ala Gb
625 630 635 640 

lie Asp Gtu Ghi He Asn Lys Ala Lys Gin lie Trp Asn Glu Lys Asn 
645 650 655 

Glu Asp Phe Lys Lys Val Aig Ghi Ghi Val De Pro Ghi Pro Thr Ghi 
660 665 670 

Leu Ala Lys Asp Gb Aig Lys Ala Ghi Glu Ala Lys Ala Ghi Ghi Lys 
675 680 685 

Val Ala Lys Arg Lys Tyr Asp Tyr Ala Thr Leu Lys Val Ala Leu Ala 
690 695 700 

Lys Ser Tyr Val Gb Ala Gb Gb Ala Xaa Leu Met Asn Lys Lys Lys 
705 710 715 720 

Met lie Leu Thr Ser Leu Ala Ser Val Ala Ik Leu Gly Ala Gly Leu 

725 730 735 

Val Thr Sex Gb Pro Thr Phe Val Arg Ala Gb Gb Ala Pro Val Ala 
740 745 750 

Ser Gb Pro Lys Ala Gb Lys Asp Tyr Asp Pro Ala Gly Lys Lys Ser 
755 760 765 

Gb Ala Ala Thr Lys Ala Tyr Gb Asp Ala Lys Pro Thr Ala Gb Asp 
770 775 7*0 

Ab Gb Lys Lys Tyr Asp Gtu Ala Gb Lys Lys Fro Asp Ala Gb Arg 
785 790 795 800 

Met Asn Lys Lys Lys Met We Leu Thr Ser Leu Ala Ser Val Ala Ik 
805 810 815 

Leu Gly Ala Gly Leu Val Ala Ser Gb Pro Thr Val Val Arg Ab Gb 
820 825 830 

Gb Ala Pro Val Ala Lys Gh Ser Gb Ah Gb Arg Asp Tyr Asp Ala 
835 840 845 

Ab Met Lys Lys Ser Gb Ala Ab Lys Lys Glu Tyr Gb Gb Ab Lys 
850 855 860 

Lys Asp Leu Gb Gb Ab Lys Ab Ab Gb Lys Lys Tyr Gly Gly Asp 
865 870 875 880 

Pro Lys Lys Thr Gly Gb Gb Thr Lys Leu Val Pro Lys Ah Asp Gly 
885 890 895 

Gb Arg Pro Lys Ab Asn Val Ab Val Pro Lys Ala Tyr Leu Lys Leu 
900 905 910 

Arg Gb Ab Gb Glu Gb Leu Asn Gb Ser Pro Asn Asn Lys Lys Asn 
915 920 925 

Ser Ala Gb Gb Lys Leu Lys Asp Ab Leu Ab His He Asp Gb Val 
930 935 940

Hit Leo Asa Gin Lys Glu Ala Obi Ala Met Asa Lys Lys Lys Met lk
945 950 955 960 

Leu Thr Ser Leu Ala Ser Val Ala lb Leu Gry Ala Gry Leu Val TTir 
965 970 975 

Ser Ghi Pro Thr Val Val Aig Ala Glu Gin Ser Pro Val Ala Ser G!n 
980 985 990 

Ser Lys Ala Glu Lys Asp Tyr Asp Ala Ala Val Lys Asa Ala Hit Ala 
995 1000 1005 

Ala Lys Lys Ala Ala Ghi Asp Ala IDs Arg Ala Leu Asp Ghi Ala Lys 
1010 1015 1020 

Ala Ala Ghi Lys Asn Tyr Asp Ghi Asp Gb Lys Lys Pro Ghi Glu Lys 
1025 1030 1035 1040 

Ala Lys Ghi Val Pro Lys Ala Pro Ala Glu Ghi Met Asn Lys Lys Lys 
1045 1050 1055 

Met He Leu Thr Ser Leu Ala Ser Val Ala lie Leu Gry Aln Gly Leu 
1060 1065 1070 

Val Ala Ser Gin Pro Thr Leu Val Arg Ata Ghi Asp Ala Pro Val Ala 
1075 1080 1085 

Asn Gin Ser Gin Ala Glu Lys Asp Tyr Asp Ala Ala Met Lys Lys Ser 
1090 1095 1100 

Glu Ala Ala Lys Lys Ghi Tyt Glu Asp Ala Lys Lyi Val Leu Ala Glu 
1105 1110 1115 1120 

Ala Glu Ala Ala Gin Lys Lys Tyr Ghi Asp Asp Gin Lys Lys Thr Ghi 
1125 1130 1135 

Ghi Lys Ala Glu Asn Ala Asn Ala Ala Ser Glu Glu Be Ala Lys Ala 
1140 1145 1150 

Thr Glu Glu Val His Met Asn Lys Lys Lys Met lie Leu Thr Ser Leu 
1155 1160 1165 

Ala Ser Val Ah De Leu Gry Ala Gly Leu Val Ala Ser Ser Pro Tnr 
1170 1175 1180 

Val Val Arg Ala Gin Ghi Ala Pro Val Ala Ser Gin Ser Lys Ala Ghi 
1185 1190 1195 1200 

Lys Asp Tyr Asp Thr Ala Lys Arg Asp Ala Ghi Asn Ala Lys Lys Ala 
1205 1210 1215 

Leu Ghi Ghi Ata Lys Arg Ala Ghi Glu Lys Tyr Ala Asp Tyr Gtn Arg 
1220 1225 1230 

Arg lie Glu Glu Lys Ala Ala Lys Ghi Thr Gin Ala Ser Leu Glu Gin 
1235 1240 1245 

Gin Glu Ala Asn Lys Asp Tyr Gin Leu Lys Leu Lys Lys Tyr Leu Asp
1250 1255 1260

Gry Aig Asa Leu Ser Asn Ser Ser Val Leu Lys Lys Glu Met Gui Ghi 
1265 1270 1275 1280 

Ala Ghi Lys Lys Asp Lys Glu Asn Gin Ala Ghi Phe Asa Lys lie Aig 
1285 1290 1295 

Arg Glu He ValVal Pro Asn Pro GtoGlo Leu Glu Met Ala Arg Arg 
1300 1305 1310 

Lys Ser Glu Val Val Lys Ala Thr Ghi Ser Gly Leu Val Thr Arg Val 
1315 1320 1325 

Glu Glu Ala Ghi Lys Asn Val Thr Asp Ala Afg Gin Lys Leu Val Leu 
1330 1335 1340 

Lys Cys Asn Glu Val Val Leu Gb Ala XaaXaa Ala Glu Leu Glu Ser 
1345 1350 1355 1360 

Gly Gry His Lys Leu Glu Pro Lys Met Asa Lys Lys Lys Met Ue Leu 
1365 1370 1375 

Thr Ser Leu Ala Ser XaaAkfle Leu Gry Ala Gry Leu Val Ala Ser 
1380 1385 1390 

Gin Pro Thr Val Val Arg Ala Ghi Glu Ak Pro Val Ala Ser Gin Ser 
1395 1400 1405 

Lys Ala Ghi Lys Asp Tyr Asp Ala Ala Lys Arg Asp Ala Ghi Asa Ala 
1410 1415 1420 

Lys Lys Ala Leu Ghi Glu Ala Lys Arg Ala Gin Lys Xaa Xaa Glu Asp 
1425 1430 1435 1440 

Asp Gin Lys Lys Thr Glu Ghi Lys Ala Lys Xaa Asp Xaa Ghi Ala Ser 
1445 1450 1455 

Ghi Ala Glu Ghi Lys Ala Asn Leu Xaa Tyr Gin Leu Leu Leu Gin Lys 
1460 1465 1470 

Tyr Val Ser Ghi Ser Asp Gry Lys Lys Lys Lys Glu Xaa Glu Xaa Xaa 
1475 1480 1485 

Ala Asp Ala Ala Lys Lys Glu lie Glu Leu Lys Xaa Ala Asp Leu Xaa 
1490 1495 1500 

Lys He Xaa Gla Ghi Met Asa Lys Lys Lys Met He Leu Thr Ser Leu 
1505 1510 1515 1520 

Ala Ser Val Ala He Lea Gly Ala Gly Leu Val Ala Ser Gin Pro Tnr 
1525 1530 1535 

Val Val Arg Ala Ghi Glu Ala Pro Val Ala Ser Gin Ser Lys Ala Glu 
1540 1545 1550 

Lys Asp Tyr Asp Ala Ala Val Glu Lys Ser Lys Ala Ala Glu Ghi Asp 
1555 1560 1565

Leu Glu Gtu Ala Ghi Ala Ala Gin Aig Lys Tyr Asp Ghi Asp Gin Lys
1570 1575 1580 

Lys Ser Glu Glu Asn Ghi Lys Gtu Thr Glu Gtu Ala Ser Glu Arg Gin 
1585 1590 1595 1600 

Ghi Ala Ala Thr Leu Lys Tyr His Leu Glu Ser Xaa Olu Phe Leu Am 
1605 1610 1615 

Tyr Phe Gbi Asp Asn His Arg Met Asn Lys Lys Lys Met Be Leu TTxr 
1620 1625 1630 

SerLeuAlaSerValAlalkUuGryAlaGlyLeuValAlaSerPro 
1635 1640 1645 

Pro Thr Val Val Arg Ak Glu Ghi Ala Pro Val Ala Ser Gin Ser Lys 
1650 1655 1660 

Ah Ghi Lys Asp Asp Thr Ala Lys Arg Asp Ak Glu Asn Ala Lys 
1665 1670 1675 1680 

Lys Ala Leu Ghi Ghi AlaLys Arg Ala Gin Glu Lys Tyr Ala Asp Tyr 
1685 1690 1695 

Gh Arg Arg De Ght Gin Lys Ala Ala Lys Ghi Inr His Ala Ser Leu 
1700 1705 1710 

Ghi Gb Ghi Ghi Ala Am Lys Asp Tyr Gin Leu Lys Leu Lys Lys Tyr 
1715 1720 1725 

Leu Asp Gfy Arg Asn Leu Ser Asn Ser Ser Val Leu Lys Lys Glu Met 
mo 1735 1740 

Gin Ghi Ala Glu Lys Lys Asp Lys Glu Lys Pro Ala Glu Phe Asn Lys 
1745 1750 1755 1760 

lie Arg Arg Ghi He Val Val Pro Asn Pro Gin Ghi Leu Ghi Met Ala 
1765 1770 1775 

Arg Arg Lys Ser Ghi Val Ala Lys Thr Lys Glu Ser Gty Leu Val Lys 
1780 1785 1790 

Arg Val Glu Glu Ala Ghi Lys Lys Val Thr Glu Ala Arg Pio Lys Leu 
1795 1800 1805 

Asp Ala Glu Arg Ala Lys Glu Val Val Leu Gin Ala Gin lie Ala Met 
1810 1815 1820 

Asn Lys Lys Lys Met He Leu Thr Ser Leu Ala Ser Val Ala lie Leu 
1825 1830 1835 1840 

Gly Ala Gry Leu Val Ala Ser Pro Pro Thr Val Val Arg Ala Glu Glu 
1845 1850 1855 

Ala Pro Val Ala Ser Gin Ser Lys Ala Glu Lys Asp Tyr Asp Thr Ala 
1860 1865 1870 

Lys Arg Asp Ala Ghi Asn Ala Lys Lys Ala Leu Glu Ghi Ala Lys Arg 
1875 1880 1885

Ala Gin Ghi Lys Tyr Ah Asp Tyr Gin Arg Arg Dc Ghi Oh Lys Ala
1890 1895 1900 

Ala Lys Ghi Thr Ms Ala Scr Leu Glu Ch Gin Ghi Ala Aan Lys Asp 
1905 1910 1915 1920 

Tyr Gin Lea Lys Leu Lys Lys Tyr Leu Asp Gly Arg Asn Leo Ser Asn 
192S 1930 1935 

Ser Ser VaJ Leo Lys Lys Ghi Met Ghi Ghi Ah Ghi Lys Lys Asp Lys 
1940 1945 1950 

Glu Lys Gin Ala Gly Leu Met Asn Lys Lys Lys Met He Leu Thr Ser 
1955 1960 1965 

Leu Ala Ser Val Ala He Leu Gly Ala Gly Leu Val Thr Ser Gin Pro 
1970 1975 1980 

Thr Leu Val Arg Ala Ghi Ghi Ser Pro Val Ala Ser Gin Ser Lys Ala 
1985 1990 1995 2000 

Ghi Lys Asp Tyr Asp Ala Ala Lys Arg Asp Ala Ghi Asn Ala Lys Lys 
2005 2010 2015 

Ala Leu Ghi Glu Ala Lys Arg Ala Gin Glu Lys Tyr Ala Asp Tyr Gin 
2020 2025 2030 

Arg Arg He Ghi Glu Lys Ala Ala Lys Glu Gin Gin Ala Ser Leu Ghi 
2035 2040 2045 

Gin Ghi Ghi Ala Asn Lys Asp Tyr Gin Leu Lys Leu Lys Lys Tyr Leu 
2050 2055 2060 

. Asp Gly Arg Asn Leu Ser Asn Ser Ser Val Leu Lys Lys Glu Met Ghi 
2065 2070 2075 2080 

Ghi Ala Glu Lys Lys Asp Lys Glu Lys Gh Ala Ghi Phe Asn Lys lie 
2085 2090 2095 

Arg Arg Ghi lie Val Val Pro Asn Pro Gin Ghi Leu Ghi Met Ala Arg 
2100 2105 2110 

Arg Lys Ser Ghi Val Val Lys Ala Lys Ghi Ser Gry Leu Val Lys Arg 
2115 2120 2125 

Val Glu Glu Ala Ghi Lys Lys Val Thr Glu Ala Arg Gin Lys Leu Asp 
2130 2135 2140 

Ala Ghi Arg Ala Lys Glu Val Val Leu Gin Pro Thr Arg Val Ghi Asn 
2145 2150 2155 2160 

Glu Val His Lys Leu Xaa Gh Lys Met Asn Lys Lys Lys Met De Leu 
2165 2170 2175 

Thr Ser Leu Ala Ser Val Ala Ile Leu Gly Ala Gly Leu Val Thr Ser
2180 2185 2190 

Gh Pro Thr Phe Val Arg Ah Ghi Glu Ser Pro Gh Val Val Ghi Lys 
2195 2200 2205

Scr Ser Leu Ola Lys Lys Tyr Glu Ghi Ala Lys Ah Lya Ala Asp Thr 
2210 2215 2220 

Ala Lys Lys Asp Tyr Gtu Thr Ala Lys Lys Lys Ala Ohx Asp Ala Ola 
2225 2230 2235 2240 

Lys Lys Tyr Gtu Asp Asp Ob Lys Arg Thr Ola Ghi Lys Ala Arg Lys 
2245 2250 2255

Ghi Ala Gtu Ala Scr Gtu Lys Leu De Asp Val Ala Leu Val Val Gin 
2260 2265 2270 

Asn Ala Tyr Lys Glu T>r Arg Glu Val Gin Asn Gto Arg Ser Lys Tyr 
2275 2280 2285 

Lys Ser Asp Ala Asp Tyr (Hn Lys Lys Leu Hit Glu Val Asp Ser Lys 
2290 2295 2300 

He Glu Lys Ala Arg Lys Glu Ok Gin Asp Leu Otn Asn Asn Phe Asa 
2305 2310 2315 2320 

Ghi Val Arg Ala Val Val Ala Pro Asp Pro Thr Cys Val Gly Xaa Asp
2325 2330 2335 

Xaa Arg Met Asn Lys Lys Lys Met He Leu Thi Ser Leu Ala Ser Val 
2340 2345 2350 

Ala Ile Leu Gly Ala Gly Xaa Val Thr Ser Gin Pro Thr Xaa Val Arg
2355 2360 2365 

Ala Gtu Ghi Ala Pro Gin Val Val Glu Lys Ser Ser Leu Glu Lys Lys 
2370 2375 2380 

Tyr Glu Ghi Ala Lys Ala Lys Tyr Asp Ala Ala Lys Lys Asp Tyr Asp 
2385 2390 2395 2400 

Glu Ala Lys Lys Lys Ala Ala Glu Ala Gin Lys Lys Tyr Glu Ghi Asp 
2405 2410 2415 

Gh Lys Lys Thr Ghi Glu Lys Ala Ghi Lys Ala Lys Ala Ah Ser Ghi 
2420 2425 2430 

Ghi He Ala Lys Ala Thr Glu Ghi Val Ghi Lys Ala Val Leu Asp Tyr 
2435 2440 2445 

He Thr Ala De Arg Asn His Asn Asp Ser Gry Lys Thr Ser Ala Ghi 
2450 2455 2460 

Glu Ala Ghi Asn Lys Ala Lys Glu Arg Asp Tyr Cys Q» Ala Gry Lys 
2465 2470 2475 2480 

Lys Pbe Asp Pro He Gin Thr Pro Phe Val Ala Ser Leu Thr Gin Met 
2485 2490 2495 

He Leu Met Asn Lys Lys Lys Met He Leu Thr Ser Leu Ala Ser Val 
2500 2505 2510 



Ala Ile Leu Gly Ala Gly Leu Val Ala Ser Ser Pro Thr Val Val Arg
2515 2520 2525 

Ala Gb Gb Ab Pro Val Ala Ser Gb Ser Lys Ala Gb Lys Asp Tyr 
2530 2535 2540 

Asp Thr Ala Lys Arg Asp Ala Gb Asn Ala Lys Lys Ala Lea Ghi Gb 
2545 2550 2555 2560 

Ala Lys Arg Ala Gin Ghi Lys T^r Ala Asp Tyt Ghi Arg Aig lie Ghi 
2565 2570 2575 

Glu Lys Ala Ala Lys Ghi Thr Gin Ala Ser Leu Gtu Ghi Gb Ghi Ala 
2580 2585 2590 

Asn Lys Asp Tyr Gin Leu Lys Leu Lys Lys Tyr Leo Asp Gly Arg Asn 
2595 2600 2605 

Leu Ser Asn Scr Ser Val Leu Lys Lys Ghi Met Glu Ghi Ah Ghi Lys 
2610 2615 2620 

Lys Asp Lys Ghi Asn Ghi Ala Glu Phe Asn Lys lie Arg Arg Glu He 
2625 2630 2635 2640 

Val Val Pro Asn Pro Gin Ghi Leu Ghi Met Ala Met Asn Lys Lys Lys 
2645 2650 2655 

Met Be Leu Thr Ser Leu Ala Ser Val Ala lie Leu Gly Ala Gly Phe 
2660 2665 2670 

Val Ala Ser Gin Pro Thr Val Val Arg Ala Glu Ghi Ser Pro Val Ala 
2675 2680 2685 

Ser Gb Ser Lys AJa Glu Lys Asp Tyr Asp Ala Ala Lys Lys Asp Ala 
2690 2695 2700 

Lys Asa Ala Lys Lys Ala Val Gtu Asp Ala Gb Lys Ala Leu Asp Asp 
2705 2710 2715 2720 

Ala Lys Ala Ala Gin Lys Lys Tyr Asp Ghi Asp Gin Lys Lys Thr Ghi 
2725 2730 2735 

Ghi Lys Ala Ala Leu Ghi Lys Ala Ala Ser Ghi Ghi Met Asp Lys Ala 
2740 2745 2750 

Val Ala Ala Val Ghi Gb Ala Tyr Leu Ala Tyr Gin Gin Ala 11>r Asp 
2755 2760 2765 

Lys Ala Ala Lys Asp Ala Ala Asp Lys Met lie Asp Glu Ala Lys Lys 
2770 2775 2780 

Arg Glu Ghi Ghi Ak Lys Thr Lys Phe Asn Thr Val Arg Ala Met Val 
2785 2790 2795 2800 

Val Pro Glu Pro Glu Gb Leu Ala Ghi Thr Lys Lys Lys Ser Gb Ghi 
2805 2810 2815 

Ala Lys Gb Lys Ala Pro Glu Leu Thr Lys Lys Leu Gb Gb Ala Lys 
2820 2825 2830 

Ala Lys Leu Glu Glu Ala Ohi Lys Lys Ala Thr Glu Ala Lys Gin Lys
2835 2840 2845 

VaJ Asp Ala Met Asa Lys Lys Lys Met He Lea Thr Ser Leu Ala Scr 
2850 2855 2860 

Val Ala lie Leu Gly Ala Gry Leu Val Ala Ser Gin Pro Thr Leu Val 
2865 2870 2875 2880 

Arg Ala Ghi Ghi Ser Pro Val Ala Ser Gin Ser Lys Ala Ghi Lys Asp 
2885 2890 2895 

Tyr Asp Ala Ala Val Lys Lys Scr Glu Ah Ala Lys Lys Ala Tyr Ghi 
2900 2905 2910 

Gh Ah Lys Lys Ah Leu Glu Glu Ah Lys Val Ah Gin Lys Lys Ty 
2915 2920 2925 

Glu Asp Asp Gin Lys Lys Thr Ghi Ghi Lys Ala Ghi Leu Ghi Lys Ghi 
2930 2935 2940 

Ala Ser Ghi Ala Iks Ala Lys Ala Tin- Ghi Glu Val Gin Gin Ala Tyr 
2945 2950 2955 2960 

Leu Ah "tyr Gin Arg Ah Ser Asn Lys Ah Glu Ah Ah Lys Met lie 
2965 2970 2975 

Ghi Glu Ala Gln Arg Arg Glu Asn Glu Ala Arg Ala Lys Phe Thr Thr
2980 2985 2990 

He Arg Thr Thr Met Val Val Pro Glu Pro Glu Gin Leu Ala Ghi Thr 
2995 3000 3005 

Lys Lys Lys Ala Glu Glu Ala Lys Ala Lys Glu Pro Lys Leu Ala Lys 
3010 3015 3020 

Lys Ah Ala Glu Ah Lys Ah Lys Leu Glu Glu Ah Glu Lys Lys Ah 
3025 3030 3035 3040 

Hir Ghi Ah Asn Pro Gin Val Asp Ah Met Asa Lys Lys Lys Met lie 
3045 3050 3055 

Leu Thr Ser Leu Ala Ser Val Ala Ile Leu Gly Ala Gly Phe Val Ala
3060 3065 3070 

Set Ser Pro Ibr Phe Val Arg Ah Ghi Gh Ah Pro Val Ah Asn Gin 
3075 3080 3085 

Scr Lys Ah Glu Lys Asp Tyr Asp Ah Ah Val Lys Lys Ser Ghi Ah 
3090 3095 3100 

Ah Lys Lys Asp Tyr Glu Thr Ah Lys Lys Lys Ah Glu Asp Ah Gh 
3105 3110 3115 3120 

Lys Lys Tyr Asp Ghi Asp Gin Lys Lys Thr Ghi Ah Lys Ah Glu Lys 
3125 3130 3135 

Gh Arg Lys Ah Ser Ghi Lys De Ah Gh Ah Thr Lys Ghi Val Gin 
3140 3145 3150

Gin Ala Tyr Leu Ala Tyr Leu Gin Ala Scr Asa Ghi Ser Ghi Aig Lys 
3155 3160 3165 

Ghi Ala Asp Lys Lys lie Lys Gta Ala Thr His Ala Lys Met Arg Arg 
3170 3175 3180 

Thr Cys Asn Leu Thr He Ghi Phe Ghi Gin G to Leu Tyr Phe Leu Asn 
3185 3190 3195 3200 

Gin Val Ser Tyr Leu Arg Leu Arg Lys Lys Gin Lys Arg Gin Gin Lys 
3205 3210 3215 

Lys Gin Lys Tyr Leu Arg Lys Asn Leu Lys Arg Gin Leu Lys Arg Tyr 
3220 3225 3230 

Lys Tyr Arg Lys De Lys Tyr Leu Asn Lys Met Leu Lys Thr Lys Arg 
3235 3240 3245 

Lys Leu Met Asn Lys Lys Lys Leu De Val Thr Ser Leu Ala Ser Val 
3250 3255 3260 

Ala Ile Leu Gly Ala Asp Ser Val Thr Ser Pro Pro Ala Leu Val Arg
3265 3270 3275 3280 

Ala Asp Gin Ala Ser Leu lie Ala Ser Gla Ser Lys Ala Ghi Lys Asp 
3285 3290 3295 

1>r Asp Ala Ala Lys Lys Asp Ala Lys Asn Ala Lys Lys Ala Val Ghi 
3300 3305 3310 

Asp Ala Gin Lys Ala Leu Asp Asp Ala Lys Ala Ala Gin Lys Lys Tyr 
3315 3320 3325 

Asp Ghi Asp Gin Lys Lys Thr Ghi Lys Lys Ala Ala Ala Val Lys Lys 
3330 3335 3340 

lie Asp Glu Glu His Gin Ala Ala Asn Leu Lys Ser Gin Gin Ala Leu 
3345 3350 3355 3360 

Val Ghi Phe Leu Ala Ala Gin Arg Ghi Gfry Asn Pro Lys Lys Lys Lys 
3365 3370 3375 

Ala Ala Gin Ala Thr Leu Ghi Ghi Ala Ghi Asn Ala Glu Lys Ghi Thr 
3380 3385 3390 

Lys Met Asn Lys Lys Lys Met He Lys Thr Ser Leu Ala Ser Ala Ala 
3395 3400 3405 

De Phe Gfry Ala Xaa Ser Glu Thr Ser Gta Pro Thr Arg Val Arg Pro 
3410 3415 3420 

Val Ghi Ala Pro Ghi Ala Arg His Pro Lys Val Asp Lys Tyr Tyr Asp 
3425 3430 3435 3440 

Ala Ghi Ala Asp Ghi Tyr Met Asn Lys Lys Lys Met De Leu Thr Ser 
3445 3450 3455 



Leu Ala Ser Val Ala Ile Leu Gly Ala Gly Phe Gly Cys Val Ser Ala
3460 3465 3470 

Tyr Ser Cys Lys Ser Aig Arg He Ser Arg Ser Ser Ala Ser Ser Gin 
3475 3480 3485 

Arg Leu Met Aan Lys Lys Lys Met Be Leu Lys Ser Leu Ala Ser Ala 
3490 3495 3500

Ala Dc Ser Gry Ala Xaa Leu Val Xaa Pro Gin Pro Thr Leu Val Arg 
3505 3510 3515 3520 

Ala Ghi Ghi Ser Pro Ala Ala Ser Gin Ser His Pro Ghi Gin Asp Tyr 
3525 3530 3535 

Asp Xaa Xaa Xaa Xaa Leu Cys Xaa Xaa Leu Xaa His Gin Pro Ser Xaa 
3540 3545 3550 

Gry Arg Thr Leu Leu Xaa Xaa Xaa Xaa Ser Xaa Pro Xaa Ser Pro Thr 
3555 3560 3565 

Pro Xaa Xaa Xaa Xaa Xaa Xaa Pro Xaa Ser Xaa Leu Tiff 
3570 3575 3580 

Pro Leu Xaa Xaa Xaa Leu Lys Pro Phe Pro Leu Pro Xaa Ser Xaa Pro 
3585 3590 3595 3600 

Xaa Pro Pro Xaa Pro Pro Xaa Ser Pro Pro Ser Pro Pro Pro Arg Pro 
3605 3610 3615 

Xaa Leu Tyr Xaa Xaa Pro Pro Xaa Pro Xaa Pro Xaa Leu Ser Leu Xaa 
3620 3625 3630 

Leu De Pro Phe Leu Leu Leu Xaa Leu Pro Pro Pro Xaa Xaa Xaa Leu 
3635 3640 3645 

Pro His Leu Xaa Ser Pro Pro Xaa Pro Xaa Leu Pro Pro Ser Pro Thr 
3650 3655 3660 

Pro Xaa Leu Lys Ghi De Asp Glu Ser Asp Ser Glu Asp Tyr Leu Lys 
3665 3670 3675 3680 

Ghi Gry Leu Arg Ala Pro Leu Gin Ser Lys Leu Asp Thr Lys Lys Ah 
3685 3690 3695 

Lys Leu Ser Lys Leu Glu Glu Leu Ser Asp Lys De Asp Ghi Leu Asp 
3700 3705 3710 

Ala Glu He Ala Lys Leu Ghi Val Gin Leu Lys Asp Ate Ghi Gry Asa 
3715 3720 3725 

Asn Asn Val Ghi Ala Tyr Phe Lys Ghi Gry Leu Ghi Lys Thr Thr Ala 
3730 3735 3740 

Ghi Lys Lys Ala Ghi Leu Ghi Lys Ala Ghi Ala Asp Leu Lys Lys Ala 
3745 3750 3755 3760 

Val Asp Ghi Pro Glu Thr Pro Ala Pro Ala Pro Gin Pro Ala Pro Ala 
3765 3770 3775 

Pro Ghi Lys Pro Ala Gbi Lys Pro Ala Pro Ala Pro Ala Pro Ghi Lys
3780 3785 3790 

Pro Ato Pro Ala Pro Gto Lys Pro Ala Gta Lys Pro Ala Glu Lys Pro 
3795 3800 3805 

Ala Glu Ghi Pro Ala Gin Lys Pro Ala Pro Ala Pro Ghi Lys Pro Ala 
3810 3815 3820 

Pro Thr Pro Glu Lys Pro Ala Pro Thr ProGhi Thr Pro 
3825 3830 3835 3840 

Trp Lys Gin Glu Asn Gly Met Val Leu Asp Xaa Thr Dc Ala Glu Gry 
3845 3850 3855 

Lys Ala Gly lie Ala Ala Xaa Pro Pro Am lie Asp Lys Thr Pro Lys 
3860 3865 3870 

Asp Leu Ghi Asp Ser Gly Leo Gly Leu Gin Lys Val Leu Ala Hit Leu 
3875 3880 3885 

Asp Pro Gly Gly Ghi Thr Pro Asp Gly Leu Asp Lys Ghi Ala Ser Ghi 
3890 3895 3900 

Asp Ser Asa Oe Gly Ala Leu Pro Asn Gin Val S«r Asp Leu Glu Asn 
3905 3910 3915 3920 

Gin Val Ser Ghi Leu Asp Arg Ghi Val Thr Arg Leu Pro Ser Asp Leu 
3925 3930 3935 

Lys Asp Thr Ghi Gly Asn Asn Val Gly Asp Tyr Vol Lys Gly Gry Leu 
3940 3945 3950 

Glu Lys Ala Leu Thr Asp Glu Lys Val Gly Leu Asn Asn Thr Pro Lys 
3955 3960 3965 

Ala Leu Asp Thr Ala Pro Lys Ala Leu Asp Thr Ala Leu Asn Glu Leu 
3970 3975 3980 

Gly Pro Asp Gry Asp Ghi Ghi Glu Thr Pro Ala Pro Ala Pro Lys Pro 
3985 3990 3995 4000 

Ghi Ghi Pro Ala Ghi Gin Pro Lys Pro Ala Pro Ala Pro Lys Pro Ghi 
4005 4010 4015 

Lys Thr Asp Asp Gin Gin Ala Glu Ghi Asp Tyr Ala Arg Arg Ser Glu 
4020 4025 4030 

Ghi Ghi Tyr Am Arg Leo Pro Gin Gin Gin Pro Pro Lys Ala Ghi Lys 
4035 4040 4045 

Pro Ala Pro Ala Pro Lys Pro Glu Ghi Pro Val Pro Ala Pro Gly Gry 
4050 4055 4060 

Trp Ser Trp Arg He Leu Leu Ala Arg Pro Asp Arg Leu Ala Ala Arg 
4065 4070 4075 4080 

Gta Ala Glu Leu Ala Ghi Lys Gin Thr Ghi Leu Gry Lys Leu Leu Asp 
4085 4090 4095

Ser Leu Asp Pro Glu Gly Lys Thr Qb) Asp Glu Leu Asp Lys Glu Ala 
4100 4105 4110 

Gly Glu Ala Glu Leu Asp Lys Lys Ala Asp Gly Leu Pro Asn Lys Val 
4115 4120 4125 

Ser Asp Leu Glu Lys Glu De Ser Asn Leu Gru He Leu Leu Gry Gry 
4130 4135 4140 

Ala Asp Ser Glu Asp Asp Thr Ala Ala Leu Pro Asa Lys Leu Ala Thr 
4145 4150 4155 4160 

Lys Lys Ala Glu Leu Glu Lys Thr GM Lys Ghi Leu Asp Ala Ala Leu 
4165 4170 4175 

Asn Ghi Leu Gry Pro Asp Gry Asp Glu Ghi Glu Thr Pro Ala Pro Ala 
4180 4185 4190 

Pro Gin Pro Glu Gin Pro Ala Pro Ala Pro Lys Pro Ghi Gtn Pro Thr 
4195 4200 4205 

Pro Ala Pro Lys Pro Ghi Ghi Pro Thr Pro Ala Pro Lys Pro Glu Gin 
4210 4215 4220 

Pro Ala Pro Ala Pro Lys Pro Glu Gin Pro Ala Pro Ala Pro Lys Pro 
4225 4230 4235 4240 

Glu Ghi Pro Ala Pro Ala Pro Lys Pro Gta Gin Pro Thr Pro Gry Pro 
4245 4250 4255 

Lys De Ghi Ghi Leu Leu Leo Leu Glu Lys Ala Gry Leu Gly Lys Ala 
4260 4265 4270 

Gry Ala Asp Lew Lys Ghi Ala Val Asn Ghi Pro Gry Glu Ser Ala Gry 
4275 4280 4285 

Glu Pro Ser Ghi Pro Ghi Glu Pro Ala Glu Ghi Ala Pro Ala Pro Glu 
4290 4295 4300 

Gin Pro Thr Ghi Pro Thr Gin Pro Ghi Ghi Pro Ala Gly Ghi Thr Pro 
4305 4310 4315 4320 

Ala Pro Lys Pro Glu Lys Pro Ala Gry Gin Pro Lys Ala Glu Lys Hit 
4325 4330 4335 

Asp Asp Gin Ghi Ala Ghi Ghi Asp Tyr Ala ArgArg Ser Ghi Ghi Ghi 
4340 4345 4350 

Tyr Asn Arg Leu Thr Gin Gin Gtn Pro Pro Lys Ala Glu Lys Pro Ala 
4355 4360 4365 

Pro Ala Pro Gtn Pro Ghi Gtn Pro Ala Pro Ala Pro Lys Leu Lys Ghi 
4370 4375 4380 

lie Asp Glu Ser Asp Ser Ghi Asp Tyr Val Lys Ghi Gry Leu Arg Val 
4385 4390 4395 4400 

Pro Leu Gb Ser Gh Lea Asp Val Lys Gb Ah Lys Leu Leu Lys Leu
4405 4410 4415 

Gh Ghi Leu Ser Asp Lys lie Asp Glu Leu Asp Ala Ghi lie Ala Lys 
4420 4425 4430 

Asn Leu Lys Lys Asp Val Glu Asp Phe Gin Asn Ser Gry Gry Giy Tyr 
4435 4440 4445 

Ser Ah Leu Tyr Leu Ghj Ala Ala Ghi Lys Asp Leu Val Ala Lys Lys 
4450 4455 4460 

Ah Ghi Leu Ghi Lys Thr Ghi Ala Asp Leu Lys Lys Ah Val Asn Ghi 
4465 4470 4475 4480 

Pro Ghi Lys Pro Ala Ghi Ghi Pro Glu Asn Pro Ala Pro Ala Pro Lys 
4485 4490 4495 

Pro Ah Pro Ala Pro Ghi Pro Glu Lys Pro Ala Pro Ala Pro Ala Pro 
4500 4505 4510 

Lys Pro Ghi Lys Ser Ah Asp Gb Gin Ala Ghi Ghi Asp Tyr Ala Arg 
4515 4520 4525 

Arg Ser Glu Glu Glu Tyr Asn Arg Leu Thr G to Gb Gb Pro Pro Lys 
4530 4535 4540 

Ala Ghi Lys Pro Ala Pro Ala Pro Val Pro Lys Pro Gb Gb Pro Ala 
4545 4550 4555 4560 

Pro Ala Pro Lys Ser Arg Val Xaa Leu Asp Arg Gly Pro Ala Glu Ala 
4565 4570 4575 

Ala Val Lys Gb Gin Val Asp Ser Pro Pro Gb Gb Leu Ala Asp Val 
4580 4585 4590 

LysGb lie Ser Thr Arg Gry Lys Phe Leu Gry Gly Ala Ala Thr Glu 
4595 4600 4605 

Asp Glu Thr Ser Ala Lea Pro Asn Lys He Thr Ah Lys Gb AlaGb 
4610 4615 4620 

Leu Ah Lys Lys Gb Thr Gb Leu Glu Lys Leu Leu Asp Asn Leu Asp 
4625 4630 4635 4640 

Pro Glu Gry Lys Thr Gb Asp Gb Leu Asp Lys Gb Ah Ah Ghi Ah 
4645 4650 4655 

Gb Leu Asp Lys Lys Ah Asp Gb Leu Pro Asn Lys Val Ah Asp Leu 
4660 4665 4670 

Glu Lys Glu He Ser Asn Leu GhUe Leu Leu Gry Gry AJa Asp Pro 
4675 4680 4685 

G b Asp Asp Thr Ah Ah Leu Pro Asn Lys Leu Ah Thr Lys Lys Ah 
4690 4695 4700 

Gb Phe Gb Lys Thr Pro Lys G lu Leu Asp Ah Ah Leu Asn G b Leu 
4705 4710 4715 4720 

Gly Pro Asp Gly Asp Ghi Gru Glu Thr Pro Ala Pro Ah Pro Ala Pro
4725 4730 4735 

Lys Pro Ghi Gin Pro Ala Pro Ala Pro Ala Pro Lys Pro Glu Gin Pro
4740 4745 4750 

Ala Pro Ala Pro Ala Pro Lys Pro Ghi Gin Pro Ala Pro Ala Pro Ala 
4755 4760 4765 

Pro Lys Pro Ghi Gin Pro Thr Pro Ala Pro Lys Len Lys Gta lie Asp 
4770 4775 4780 

Ghi Ser Asp Ser Ghi Asp T>r Be Lys Glu Gry Leo Arg Ala Pro Leu 
4785 4790 4795 4800 

Gin Ser Lys Leu Asp Ala Lys Lys Ala Lys Leu Ser Lys Leo Asp Glu 
4805 4810 4815 

Leu Ser Asp Lys lie Asp G lu Leu Asp Ala Glu He Ala Lys Leu Ghi 
4820 4825 4830 

Lys Asp Val Gly Asp Phc Pro Asn Ser Asp Gry Glu Gin Ala Gly Gin 
4835 4840 4845 

Tyr Leu Val Ala Ala Glu Lys Asp Leu Asp Ala Lys Glu Ala Glu Leu 
4850 4855 4860 

Gry Asn Thr Gry Ala Asp Leu Lys Lys Ala Val Asp Ghi Pro Ghi Thr 
4865 4870 4875 4880 

Pro Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Ala Pro Thr 
4885 4890 4895 

Pro Ghi Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys Pro 
4900 4905 4910 

Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro 
4915 4920 4925 

Lys Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys Pro Ghi Arg 
4930 4935 4940 

Thr Glu Asn Asp Gry Val Ghi Arg Thr Arg Lys Arg Ala Pro Lys Arg 
4945 4950 4955 4960 

He Met Ser Leu Ser Gin Lys Val Xaa Leu Lys Xaa Val Cys Arg Ala 
4965 4970 4975 

Pro Leu Gin Ser Lys Leu Asp Ala Gin Lys Ala Ghi Leu Leu Lys Leu 
4980 4985 4990 

Glu Glu Leu Ser Gry Lys He Ghi Ghi Leu Asp Ala Ghi He Ala Glu 
4995 5000 5005 

Leu Glu Val Gin Leu Lys Asp Ala Ghi Gry Asn Asn Asn Val Glu Ala 
5010 5015 5020 

Tyr Phe Lys Ghi Gry Leu Glu Lys Thr Thr Ala Ghi Lys Lys Ala Ghi 
5025 5030 5035 5040

Leu Glii Xaa Ala Xaa Ala Asp Leu Lys Lys Ala Val Asp Gto Pro Olu 
5045 5050 5055 

Tin- Pro Ala Pro Ala Pro Ala Pro Ala Fro Ala Pro Ala Pro Ala Pro 
5060 5065 5070 

Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro 
5075 5080 5085 

Lys Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro 
5090 5095 5100 

Ala Pro Lys Pro Ala Pro Ala Pro Ala Pro Ala Pro Lys Pro Gto Lys 
5105 5110 5115 5120 

Pro Ala Gto Lys Pro Ala Pro Ala Pro Lys Pro Gto Thr Xaa Lys Thr 
5125 5130 5135 

Tyr Gry Leu Lys Olu He Asp Ghi Ser Asp Sot Gto Asp Val Arg 
5140 5145 5150 

Gto Gry Phc Arg Ala Pro Leu Gto Ser Gto Leo Asp Ala Lys Gin Ala 
5155 5160 5165 

Lys Leu Ser Lys Leu Glu Gto Leu Ser Asp Lys He Asp Ghi Leu Asp 
5170 5175 5180 

Ala Gto lie Ala Lys Leu Gto Lys Asp Val Glu Asp Phe Gto Asn Ser 
5185 5190 5195 5200 

Asp Gry Gto Gto Ala Gry Gto Tyr Lea Ala Ala Ala Gry Gto Asp Leu 
5205 5210 5215 

He Ala Lys Lys Ala Gto Leu Gto Lys Ala Gto Ala Asp Leu Lys Lys 
5220 5225 5230 

Ala Val Asp Glu Pro Gto Thr Pro Ala Pro Ala Pro Ala Pro Ala Pro 
5235 5240 5245 

Ala Pro Ala Pro Thr Pro Gto Ala Pro Ala Pro Ala Pro Ala Pro Ala 
5250 5255 5260 

Pro Lys Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys Pro Ala 
5265 5270 5275 5280 

Pro Ala Pro Lys Pro Ala Pro Ah Pro Lys Pro Ala Pro Ala Pro Lys 
5285 5290 5295 

Pro Ato Pro Ato Pro Ala Pro Ala Pro Lys Pro Gto Lys Pro Ala Gto 
5300 5305 5310 

Lys Pro Ala Pro Ala Pro Lys Pro Gto Leu Lys Glu lie Asp Gto Ser 
5315 5320 5325 

Asp Ser Gto Asp iy Val Lys Gto Gry Phe Arg Ala Pro Leu Gto Ser 
5330 5335 5340 

Ghi Leu Asp Ala Lys Gin Ala Lys Leu Ser Lys Leu Ghi Glu Leu Ser
5345 5350 5355 5360

Asp Lys He Asp Ghi Leu Asp Ala Glu lie Ala Lys Leu Ghi Asp Gin 
5365 5370 5375 

Leu Lys Ala Ah Ghi Ghi Asn Asa Asa Val Glu Asp Tyr Phe Lys Glu 
5380 5385 5390 

Gly Leu Ghi Lys Thr De Ala Ala Lys Lys Ala Gta Leu Gla Lys Ttar 
5395 5400 5405 

Ghi Ala Asp Leu Lys Lys Ala Val Asn Ghi Pro Glu Lys Pro Ala Ghi 
5410 5415 5420 

Ghi Pro Ser Gin Pro Ghi Lys Pro Ala Ghi Ghi Ala Pro Ala Pro Ghi 
5425 5430 5435 5440 

Gh Pro Thr Ghi Pro Thr Ghi Pro Ghi Lys Pro Ala Ghi Ghi Pro Gin 
5445 5450 5455 

Pro Ala Pro Ala Pro Gh Pro Ghi Lys Pro Ala Ghi Ghi Thr Pro Ala 
5460 5465 5470

Pro Lys Pro Glu Lys Pro Ala Ghi Gin Pro Lys Ala Ohi Lys Pro Ala 
5475 5480 5485 

Asp Gto Gin Ala Glu Ghi Asp Tyr Ala Arg Arg Ser Glu Glu Ghi Tyr 
5490 5495 5500 

Asn Arg Lea Thr Gto Gto Gta Pro Pro Lys Ala Ghi Lys Pro Ala Pro 
5505 5510 5515 5520 

Ala Pro Lys Thr Lys Gry Gly Ser Ala Leu Asp Gin Ghi Ala Ala Ala 
5525 5530 5535 

Pro Pro His Gin Val Ala Asp Leu Ghi Lys Gta lie Thr Gry Pro Ghi 
5540 5545 5550 

UcPhe Leu Gly Ghy Ala Asp Pro Glu Ala Asp lie Ala Ala Arg Pro 
5555 5560 5565 

Asa Ghi Leu Ala Ala Lys Gta Ala Ghi Leu Ala Gta Lys Pro Thr Gry 
5570 5575 5580 

Leu Ghi Lys Leu Leu Asp Ser Leu Asp Pro Ghy Gry Lys Thr Gta Asp 
5585 5590 5595 5600 

Ghi Leu Asp Lys Ghi Ala Gly Gh) Ala Ghi Leu Asp Lys Lys Ala Asp 
5605 5610 5615

Ghi Leu Pro Asn Lys Val Ala Asp Leu Ghi Lys Ghi De Ser Asn Leu 
5620 5625 5630 

Glu De Leu Leu Gry Gly Ala Asp Ser Glu Asp Asp Thr Ala Ala Leu 
5635 5640 5645 

Pro Asn Lys Leu Ala Xaa Lys Xaa Ala Ghi Leu Glu Lys Thr Gta Lys 
5650 5655 5660 

Olu Leu Asp Ala Ala Pro Asn Glu Leo Gfy Pro Asp Gry Asp Giu Glu
5665 5670 5675 5680 

Glu llnr Pro Ala Pro Ala Pro Gin Pro Glu Gin Pro Ala Pro Ala Pro 
5685 5690 5695 

Lys Pro Ghi Gin Pro Ala Pro Ala Pro Lys Pro Gin Gin Pro Ala Pro 
5700 5705 5710 

Ah Pro Lys Pro Ghi Gin Pro Ala Pro Ala Pro Lys Pro Ghi Gin Pro 
5715 5720 5725 

Ala Pro Ala Pro Lys Pro Glu Gin Pro Ala Lys Pro Ghi Lys Pro Ala 
5730 5735 5740 

Ghi Ghi Pro Tbr Ghi Pro Giu Lys Pro Ala Hi Pro Lys Ttar Arg Val 
5745 5750 5755 5760 

Arg Ala Leu Lys Val Ala Ghi Phe Gry Val Gin Leu Arg Asp Ala Gly 
5765 5770 5775 

Gly Ser Asn Asn Val Gly Ala Tyr Phe Lys Ghi Gry Leu Ghi Ghi Thr 
5780 5785 5790 

Thr Ab Glu Xaa Glu Ala Gry Leu Gly Lys Ala Ghi Ala Asp Leu Lys 
5795 5800 5805 

Lys Ala Val Asp Glu Pro Ghi Thr Pro Ala Pro Ala Pro Ala Pro Ala
5810 5815 5820 

Pro Ala Pro Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys 
5825 5830 5835 5840 

Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala 
5845 5850 5855 

Pro Lys Pro Ah Pro Ala Pro Ala Pro Ala Pro Lys Pro Ghi Lys Pro 
5860 5865 5870 

Ala Glu Lys Pro Ala Pro Ak Pro Lys Pro Ghi Thr Pro Lys Thr Leu 
5875 5880 5885 

Lys Asp lie Asp Ghi Ser Asp Ser Ghi Asp Tyr Ala Lys Ghi Gry Leu 
5890 5895 5900 

Arg Ala Pro Leu Ghi Ser Glu Leu Asp Thr Lys Lys Ala Lys Leu Leu 
5905 5910 5915 5920 

Lys Leu Glu Glu Leu Ser Gry Lys He Ghi Ghi Leu Asp Ala Glu He 
5925 5930 5935 

Xaa Ghi Leu Ghi Val Gin Leu Lys Asp Ala Glu Gry Asn Asn Asn Val 
5940 5945 5950 

Ghi Ala Tyr Phe Lys Glu Gry Leu Glu Lys Thr Thr Ala Ghi Lys Lys 
5955 5960 5965 

Ala Ghi Leu Ghi Lys Ala Ghi Ala Asp Leu Lys Lys Ala Val Asp Glu 
5970 5975 5980

Pro Ghi Tin* Pro Ala Pro Ah Pro Ala Pro Ala Pro Ala Pro Ala Pro 
5985 5990 5995 6000 

Thr Pro Glu Ah Pro Ala Pro Ala Pro Ah Pro Lys Pro Ah Pro Ala 
6005 6010 6015 

Pro Lys Pro Ala Pro Ala Pro Lys Pro Ala Pro Ala Pro Lys Pro Ala
6020 6025 6030 

Pro Ah Pro Lys Pro Ah Pro Ah Pro Lys Pro Ah Pro Ah Pro Ala 
6035 6040 6045 

Pro Ah Pro Ah Pro Lys Pro Ah Pro Ah Pro Ah Pro Ah Pro Ah 
6050 6055 6060 

Pro Lys Pro Glu Lys Pro Ah Ghi Lys Pro Ah Pro Ah Pro Lys Pro 
6065 6070 6075 6080 

Ghi Thr Pro Lys Tin Gry Tip Lys Gin Glu Asa Gry Met Lea Lys Ghi 
6085 6090 6095 

De Asp Ghi Ser Asp Ser Gh Asp 7y Val Lys Gh Gry FfceAig Ah 
6100 6105 6110

Pro Leu Gin Ser Ghi Leu Asp Ah Lys Gin Ah Lys Leu Ser Lys Leo 
6115 6120 6125 

Ghi Glu Xaa Ser Asp Lys Xaa Asp Ghi Leu Asp Ah Ghi De Ah Lys 
6130 6135 6140 

Leo Ghi Lys Asp Val Gta Asp Phe Lys Asn Ser Asp Gry Glu Gh Ah 
6145 6150 6155 6160 

Gry Ghi Tyr Leu Ah Ah Ah Glu Glu Asp Leu He Ah Lys Lys Ah 
6165 6170 6175 

Xaa Leu Ghi Lys Ah Ghi Ah Asp Leu Lys Lys Ah Val Asp Ghi Pro 
6180 6185 6190 

Ghi Thr Pro Ah Pro Ah Pro Ala Pro Ala Pro Ah Pro Ah Pro Thr 
6195 6200 6205 

Pro Ghi Ah Pro Ah Pro Ah Pro Ah Pro Ah Pro Lys Pro Ah Pro 
6210 6215 6220 

Ah Pro Lys Pro Ah Pro Ah Pro Lys Pro Ah Pro Ah Pro Lys Pro 
6225 6230 6235 6240 

Ah Pro Ah Pro Lys Pro Ah Pro Ah Pro Ah Pro Ah Pro Lys Pro 
6245 6250 6255 

Gh Lys Pro Ah Ah Leu Lys Glu De Asp Ghi Ser Asp Val Glu Val 
6260 6265 6270 

Lys Lys Ah Ghi Leu Gh Leu Val Lys Glu Ghi Ah Lys Ghi Pro Arg 
6275 6280 6285 

Asn Glu Ghi Lys Val Lys Gh Ah Lys Ala Ghi Val Ghi Ser Lys Lys
6290 6295 6300 

Ala Ghi Ah Thr Arg Leu Ghi Lys lie Lys Tfar Asp Arg Lys Lys Ala 
6305 6310 6315 6320 

Ghi Glu Ala Lys Arg Lys Ala Ala Ghi Ghi Asp Lys Val Lys Glu Lys 
6325 6330 6335 

Pro Ah Pn> Lys Pro G hi Asn Pro Ah Ghi Gtn Pro Lys Ah Glu Lys 
6340 6345 6350 

Pro Ah Asp Gin Gin Ah Ghi Glu Asp Tyr Ah Arg Arg Ser Glu Ghi 
6355 6360 6365 

Ghi Tyr Xaa Arg Lea Tfar Gtn Gin Gin Pro Pro Lys Thr Gh Lys Pro 
6370 6375 6380 

Ah Gh Pro Ser Tfar Pro Lys Hit Lys Gry Ghi Ah Arg Ghi Ser Arg 
6385 6390 6395 6400 

Xaa Glu Glu Lys Val Asn Gin Pro Lys Xaa Ghi Val Glu Ser Lys Lys 
6405 6410 6415 

Xaa Ghi Ah Thr Arg Leu Glu Lys He Lys Thr Asp Arg Lys Lys Ah 
6420 6425 6430 

Ghi Ghi Ah Xaa Arg Lys Ah Ah Ghi Glu Asp Lys Val Lys Glu Lys 
6435 6440 6445 

Pro Ah Ghi Gin Pro Gin Pro Ah Pro Ah Pro Gh Pro Gh Lys Pro 
6450 6455 6460 

Ah Pro Ah Pro Lys Pro Glu Asn Pro Ah Ghi Gh Pro Lys Ah Glu 
6465 6470 6475 6480 

Lys Pro Ah Asp Gin Gin Ah Ghi Ghi Asp Tyr Ah Arg Arg Ser Ghi 
6485 6490 6495 

Gin Ghi Tyr Asn Arg Leu Thr Gh Gh Gto Pro Pro Lys Thr Ghi Lys 
6500 6505 6510 

Pro Ah Gin Pro Ser Thr Xaa Lys lie Lys Glu Xaa Asp Glu Ser Xaa 
6515 6520 6525 

Ser Glu Asp Tyr Leu Lys Glu Gry Leu Arg Ah Pro Leu Gh Ser Lys 
6530 6535 6540 

Leu Asp Thr Lys Lys Ah Lys Leu Ser Lys Leu Ghi Ghi Leu Ser Asp 
6545 6550 6555 6560 

Lys lie Asp Ghi Leu Asp Ah Ghi He Ah Lys Leu Glu Val Gin Leu 
6565 6570 6575 

Lys Asp Ah Gh Gly Asn Asn Asn Val Glu Ah Tyr Phe Lys Glu Gry 
6580 6585 6590 

Leu Glu Lys Thr Thr Ah Ghi Lys Lys Ah Ghi Leu Ghi Lys Ah Gh 
6595 6600 6605 



Ala Asp tea Lys Lys Ala Val Asp Ghi Pro Ghi Thr Pro Ala Pro Ala 
6610 6615 6620 

Pro Ghi Pro Ala Pro Ala Pro Ohi Lys Pro Ala Glu Lys Pro Ala Pro 
6625 6630 6635 6640 

Ala Pro Ala Pro Gto Lys Pro Ala Pro Ala Pro Ghi Lys Pro Ala Pro 
6645 6650 6655 

Thr Pro Glu Lys Pro Ala Pro Thr Pro Ghi Thr Pro Lys Hit Ory Tip 
6660 6665 6670 

Lys Gin Ghi Asa Gly Met Trp Tyr Phe Tyr Asn Thr Asp Gry Ser Met 
6675 6680 6685 

AkThrGryTipLe^GlnAsnAsnGlySerTipTyrT>L«Asn^ 
6690 6695 6700 

Asn Gry Ala Met Ala Thr Gry Tip His Gin Asn Asn Gry Ser Trpiy 
6705 6710 6715 6720 

Tyr Leu Asn Ser Leu Lys Gto He Asp Glu Ser Asp Ser Ghi Asp Tyr 
6725 6730 6735 

Leu Lys Glu Gry Leu Arg Ala Pro Leu Gin Ser Lys Leu Asp HirLys 
6740 6745 6750 

Lys Ala Lys Leu Ser Lys Leu Ghi Ghi Leu Ser Asp Lys lie Asp Ghi 
6755 6760 6765 

Leu Asp Ala Ghi He Ala Lys Leu Ghi Val Gin Leu Lys Asp Ala Glu 
6770 6775 6780 

Gly Asn Asn Asn Val Ghi Ala Tyr Phe Lys Glu Gly Leu Glu Lys Hit 
6785 6790 6795 6800

Hit Ala Glu Lys Lys Ala Ghi Leu Glu Lys Ala Glu Ala Asp Leu Lys 
6805 6810 6815 

Lys Ala Val Asp Ghi Pro Asp Thr Pro Ala Pro Ala Pro Gin Pro Ala 
6820 6825 6830 

Pro Ala Pro Ghi Lys Pro Ala Glu Lys Pro Ala Pro Ala Pro Ala Pro 
6835 6840 6845 

Glo Lys Pro Ala Pro Ala Pro Ghi Lys Pro Ala Pro Ala Pro Ghi Lys 
6850 6855 6860 

Pro Ala Pro Ala Pro Ghi Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala 
6865 6870 6875 6880 

Pro Ala Pro Ghi Lys Pro Ala Pro Ala Pro Ghi Lys Pro Ala Pro Ala 
6885 6890 6895 

Pro Lys Pro Ghi Thr Pro Ghi Thr Arg Leu Glu Thr Arg Lys Arg Tyr 
6900 6905 6910 

Leu Lys Ghi He Asp Ghi Ser Asp Ser Glu Asp Tyr Leu Lys Ghi Gry 
6915 6920 6925

Lea Axg Ala Pro Lea Gin Ser Lys Leu Asp Thr Lys Lys Ala Lys Leu 
6930 6935 6940 

Ser Lys Leu Glu Ohi Leu Ser Asp Lys lie Asp Obi Leu Asp Ala Ghi 
6945 6950 6955 6960 

De Ala Lys Leu Ghi Val Gin Leu Lys Asp Ala Glu Gry Asn Asn Asn 
6965 6970 6975 

Val Glu Ala Tyr Phe Lys Ghi Gry Leu Glu Lys Thr Thr Ala Glu Lys 
6980 6985 6990 

Lys Ala Glu Leu Ghi Lys Ala Ghi Ala Asp Leu Lys Lys Ala Val Asp 
6995 7000 7005 

Glu Pro Glu Thr Pro Ala Pro Ala Pro Gin Pro Ala Pro Ala Pro Glu 
7010 7015 7020 

Lys Pro Ala Glu Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala Pro Ala 
7025 7030 7035 7040

Pro Ghi Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala Pro Ala Pro Ghi 
7045 7050 7055 

Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala Pro Thr Pro Glu Thr Pro
7060 7065 7070 

Lys Thr Gry ftp Lys Ghi Ghi Asa Gry Met Leu Lys Glu He Asp Ghi 
7075 7080 7085 

Ser Glu Ser Ghi Asp Tyr Ala Lys Ghi Gry Phe Arg Ala Pro Leu Gin 
7090 7095 7100 

Ser Lys Leu Asp Ala Lys Lys Ala Lys Leu Ser Lys Leu Ghi Glu Leu 
7105 7110 7115 7120 

Ser Asp Lys lie Asp Glu Leu Asp Ala Glu lie Ala Lys Leu Glu Asp 
7125 7130 7135 

Ghi Leu Lys Ala Ala Ghi Ghi Asn Asn Asn Val Ghi Asp Tyr Phu Lys 
7140 7145 7150 

Ghi Gly Leu Ghi Lys Thr De Ala Ala Lys Lys Ala Ghi Leu Glu Lys 
7155 7160 7165 

Thr Glu Ala Asp Leu Lys Lys Ala Val Asn Ghi Pro Ghi Lys Pro Ala 
7170 7175 7180 

Pro Ala Pro Glu Thr Pro Ala Pro Glu Ala Pro Ala Glu Gin Pro Lys 
7185 7190 7195 7200 

Pro Ala Pro Ah Pro Gin Pro Ala Pro Ala Pro Lys Pro Ghi Lys Pro 
7205 7210 7215 

Ala Glu Gin Pro Lys Pro Ghi Lys Thr Asp Asp Gin Gin Ala Glu Ghi 
7220 7225 7230 

Asp Tyr Ala Arg Arg Ser GIu Gtu Gto Tyr Asn Arg Leu Thr Gin Gta
7235 7240 7245 

Gin Pro Pro Lys Ala Ghi Lys Pro Ala Pro Ala Pro Lys Thr Gly Tip 
7250 7255 7260 

Lys Gin Ghi Asd Gly Met Tip Tyr Hie Tyr Asn Thr Asp Gry Scr Met 
7265 7270 7275 7280 

Gry Ghi Gin Ala Gry Gin Tyr Arg Ala Ala Ala Glu Gly Asp Leu Ala 
7285 7290 7295 

Ala Lys Gin Ala Ghi Leu Ghi Lys Thr Ghi Ala Asp Leu Lys Lys Ala 
7300 7305 7310 

Val Asn Ghi Pro Glu Lys Pro Ala Pro Ala Pro Ghi Thr Pro Ala Pro 
7315 7320 7325 

Glu Ala Pro Ala Ghi Gin Pro Lys Pro Ala Pro Ah Pro Gin Pro Ala 
7330 7335 7340 

Pro Ala Pro Lys Pro Glu Lys Pro Ala Ghi Gin Pro Lys Ala Ghi Lys 
7345 7350 7355 7360 

Thr Asp Asp Ghi Gin Ala Ghi Ghi Asp Tyr Ala Arg Arg Sex Ghi Glu 
7365 7370 7375 

Ghi Tyr Asn Arg Leu Thr Gin Gin Gin Pro Pro Lys Ala Glu Lys Pro 
7380 7385 7390 

Ala Pro Ala Pro Lys Pro Glu Gin Pro Ala Pro Ala Pro Lys Asn Ser 
7395 7400 7405 

Lys Gry Ghi Gbi Ala Glu Gin Tyr Arg Ser Ala Ala Gry Gry Asp Leu 
7410 7415 7420 

Ala Ala Lys Gin Val Ghi Leu Glu Lys Thr Ghi Ala Asp Leu Lys Lys 
7425 7430 7435 7440 

Ala Val Asn Glu Pro Ghi Lys Pro Ala Pro Ala Pro Ghi Thr Pro Ala 
7445 7450 7455 

Pro Glu Ala Pro Ala Glu Gin Pro Lys Pro Ala Pro Ala Pro Ghi Pro 
7460 7465 7470 

Ala Pro Ala Pro Lys Pro Ghi Lys Fro Ala Glu Gin Pro Lys Ala Glu 
7475 7480 7485 

Lys Pro Ala Asp Gin Gin Ala Ghi Ghi Asp Tyr Asp Arg Arg Ser Ghi 
7490 7495 7500 

Glu Glu Tyr Asn Arg Leu Thr Gin Gin Gin Pro Pro Lys Ala Ghi Lys 
7505 7510 7515 7520 

Pro Ala Pro Ala Pro Gh Pro Glu Gin Pro Ala Pro Ala Pro Lys Ser 
7525 7530 7535 

Leu Lys Glu He Asp Ghi Ser Asp Ser Ghi Asp Tyr Val Lys Glu Gry 
7540 7545 7550 

Phe Arg Ala Pro Leu Gin Ser Ghi Leu Asp Ala Lys Gtn Ala Lys Leu
7555 7560 7565 

Ser Lys Leu Glu Glu Leu Ser Asp Lys He Asp Ghi Leu Asp Ala Gtu 
7570 7575 7580 

De Ala Lys Leu Ghi Lys Asp Val Glu Asp Phe Lys Xaa Ser Asp Gry 
7585 7590 7595 7600 

Ghi Gin Ala Gty Gfai Tyr Leu Ala Ala Ala Ghi Ghi Asp Leu De Ala 
7605 7610 7615 

Lys Lys Ala Ghi Leu Ghi Gin Thr Glu Ala Asp Leu Lys Lys Ala Val 
7620 7625 7630 

Asn Ghi Pro Gty Lys Pro Ala Pro Ala Pro Ala Pro Ghi Thr Pro Ala 
7635 7640 7645 

Pro Glu Ala Pro Ala Ghi Gin Pro Lys Pro Ala Pro Glu Thr Pro Ala 
7650 7655 7660 

Pro Ala Pro Lys Pro Ghi Lys Pro Ala Ghi Gin Pro Lys Pro Glu Lys 
7665 7670 7675 7680 

Pro Ala Asp Ghi Ghi Ala Ghi Ghi Asp Tyr Ala Arg Arg Ser Glu Glu 
7685 7690 7695 

Glu Tyr Asn Arg Leu Thr Gin Gin Gin Pro Ala Pro Ala Gin Lys Pro 
7700 7705 7710 

Glu Gin Pro Ala Lys Pro Glu Lys Pro Ala Ghi Glu Pro Thr Gin Pro 
7715 7720 7725 

Ghi Lys Asp Ala GluDe Ala Lys Leu Glu Lys Asn Val Ghi Tyr Phe 
7730 7735 7740 

Lys Lys Thr Asp Ala Ghi Gin Thr Ghi Gin Tyr Leu Ala Ala Ala Glu 
7745 7750 7755 7760 

Lys Asp Leu Ala Asp Lys Lys Ala Ghi Leu Ghi Lys Thr Ghi Ala Asp 
7765 7770 7775 

Leu Lys Lys Ala Val Asn Ghi Pro Ghi Lys Pro Ala Ghi Glu Thr Pro 
7780 7785 7790 

Ala Pro Ala Pro Lys Pro Ghi Gin Pro Ala Ghi Oh Pro Lys Pro Ala 
7795 7800 7805 

Pro Ala Pro Gin Pro Ala Pro Ala Pro Lys Pro Ghi Lys Thr Asp Asp 
7810 7815 7820 

Gin Ghi Ala Ghi Glu Asp Tyr Ala Arg Arg Ser Glu Ghi Glu Tyr Asn 
7825 7830 7835 7840 

Arg Leu Pro Gin Gin Gin Pro Pro Lys Ala Ghi Lys Pro Ala Pro Ala 
7845 7850 7855 

Pro Lys Pro Glu Gbi Pro Val Pro Ala Ghi Xaa Pro Ghi Asn Pro Ala 
7860 7865 7870

Pro Ala Pi© Lys Pro Ala Xaa Ala Pro Gin Pro Leu Lys Pro Gin Chi 
7875 7880 7885 

Pro Ala Ghi Gin Pro Lys Pro Glo Lys Pro Glu Glu Pro Ala Gry Gin 
7890 7895 7900 

Pro Ghi Pro Ghi Lys Pro Asp Asp Ghi Gin Ala Gry Glu Asp Tyr Ala 
7905 7910 7915 7920 

Aig Arg Scr Gry Giy Oh* Tyr Asn Arg Pbe Pro Gin Gin Gin Pro Pro 
7925 7930 7935 

Lys Ala Ghi Lys Pro Ala Pro Ala Pro Lys Pro Ghi Gin Pro Val Pro 
7940 7945 7950 

Ala Pro Lys Thr Leu Leu Lys Lyi Ala Lys Leu Ala Gry Ah Lys Ser 
7955 7960 7965 

Lys Ala Ala Thr Lys Lys Ala Ghi Leu Glu Pro Ghi Leu Ghi Lys Ala 
7970 7975 7980 

Glu Ala Glu Leu Ghi Aai Leu Leu Ser Thr Leu Asp Pro Ghi Gry Lys 
7985 7990 7995 8000

Thr Gin Asp Glu Leu Asp Lys Ghi Ala Ala Ghi Ala Gro Leu Asn Lys 
8005 8010 8015 

Lys Val Ghi Ala Leu Pro Asn Ghi Val Ser Ghi Leu Gro Glu Ghi Leu 
8020 8025 8030 

Sit Lys Leu Glu Asp Asn Leu Lys Asp Ala Ghi Thr Asn Asn Val Ghi 
8035 8040 8045 

Asp Tyr lie Lys Glu Gry Leu Ghi Ghi Ala lie Ala Thr Lys Gin Ala 
8050 8055 8060 

Ghi Leu Ghi Lys Thr Pro Lys Ghi Leu Asp Ala Ala Leu Asn Ghi Leu 
8065 8070 8075 8080 

Gry Pro Asp Gry Asp Ghi Ghi Ghi Thr Pro Pro Pro Ghi Ala Pro Ala 
8085 8090 8095 

Glu Gin Pro Ly9 Pro Glu Lys Pro Ala Ghi Ghi Thr Pro Ala Pro Ala 
8100 8105 8110 

Pro Lys Pro Glu Lys Ser Ala Asp Gin Ghi Ala Glu Ghi Asp Tyr Ala 
8115 8120 8125 

Arg Arg Ser Glu Ghi Ghi Tyr Asn Arg Leu Thr Gin Gin Gin Pro Pro 
8130 8135 8140 

Lys Ala Ghi Lys Pro Ala Pro Ab Pro Ala Pro Lys Pro Ghi Gin Pro 
8145 8150 8155 8160 

Ala Pro Ala Pro Lys Ser Arg Gry Leu Ala Thr Lys Lys Lys Leu Asn 
8165 8170 8175 
Leu Ala Ghi Ala Aig Be Glu Leu Leu Leu Lys Lys Leu Gry Leu Glu 
8180 8185 8190 

ProGtyl^GluLysAlaGlyAlaGlyLcuGlyAsnLcoLcuSerThr 
8195 8200 8205 

Leu Asp Pro Ghi Gly Lys Thr Gin Asp Ghi Leu Asp Lys Ghi Ala Ala 
8210 8215 8220 

Oh Ala Ghi Leu Asn Lys Lys Val Glu Ala Leu Pro Asn Gin Val Ala 
8225 8230 8235 8240 

Glu Leu Ghi Ghi Ghi Leu Ser Lys Leu Ghi Asp Asn Leu Lys Asp Ala 
8245 8250 8255 

Ohi Un* Asn His Val Ghi Asp Tyr lie Lys Glu Gly Leu Glu Glu Ala 
8260 8265 8270 

Be Ala Thr Lys Gin Ala Ghi Leu Glu Lys Thr Pro Lys Glu Leu Asp 
8275 8280 8285 

Ala Ala Leu Asn Ghi Leu Gry Pro Asp Gly Asp Glu Glu Ghi Thr Pro 
8290 8295 8300 

Ala Pro Glu Ala Pro Ala Ghi Gin Pro Lys Pro Ghi Lys Pro Ala Glu 
8305 8310 8315 8320 

Ghi Thr Pro Ah Pro Ala Pro Lys Pro Ohi Lys Ser Ala Asp Gin Gin 
8325 8330 8335 

Ala Ghi G hi Asp Tyt Ala Atg Arg Ser Glu Glu G hi Tyc Asn Arg Leu 
8340 8345 8350 

Thr Gin Ghi Ghi Pro Pro Lys Ala Ghi Lys Pro Ala Pro Ala Pro Ala 
8355 8360 8365 

Pro Lys Pro Glu Gfa Fro Ala Pro Ala Pro Lys Lys Lys Gin Lys Val 
8370 8375 8380 

Asn Leu Ghi Asn Leu Leu Ser Thr Leu Asp Pro Gry Gry Lys Hit Gin 
8385 8390 8395 8400 

Asp Glu Leu Asp Lys Gry Ala Ala Ghi Ala Glu Leu Asn Lys Lys Val 
8405 8410 8415 

Glu Ala Leu Pro Asn Pro Val Xaa Ghi Leu Glu Glu Ghi Leu Ser Pro 
8420 8425 8430 

Pro Glu Asp Asn Leu Lys Asp Ala Glu Thr Asn His Val Glu Asp Tyr 
8435 8440 8445 

lie Lys Ghi Gly Leu Ghi Ghi Ala He Ala Thr Lys Gin Ala Glu Leu 
8450 8455 8460 

Glu Ghi Thr Pro Gb Ghi Val Asp Ala Ala Leu Asn Asp Leu Val Pro 
8465 8470 8475 8480 

Asp Gry Gry Glu Ghi Ghi Thr Pro AJa Pro Ala Pro Gin Pro Asp Ghi 
8485 8490 8495 
Pro Ala Pro Ala Pro Ala Pro Asn Ala Ghi Gin Pro Ala Pro Ala Pro 
8500 8505 8510 

Lys Pro Ghi Lys Ser Ala Asp Gin Gin Ala Glu Glu Asp Tyr Ala Arg 
8515 8520 8525 

Arg Ser Ghi Gry Ohi Tyr Aso Arg Leu Thx Gin Ghi Gin Pro Pro Lys 
8530 8535 8540 

Ala Glu Lys Pro Ah Pro Ala Pro Ala Pro Lys Pro Glu Gin Pro Ala 
8545 8550 8555 8560 

Pro Ala Pro Asn Lys Glu De Ala Arg Leu Gin Scr Asp Leu Lys Asp 
8565 8570 8575 

Ala Glu Goi Asn Asn Val Ghi Asp Tyr tie Lys Glu Gry Leu Glu Gb 
8580 8585 8590 

Ala Be Thr Asn Lys Lys Ala Ghi Leu Ala Thr Thr Gbi Gbi Asn lie 
8595 8600 8605 

Asp Lys Thr Gin Lys Asp Leu Glu Asp Ala Glu Leu Ghi Leu Glu Lys 
8610 8615 8620 

Val Leu Ala Tbx Leu Asp Pro Ghi Gry Lys Thr Gbi Asp Gin Leu Asp 
8625 8630 8635 8640 

Lys Glu Ala Ala Ghi Ala Gb Leu Asn Ghi Lys Val Ghi Ala Leu Ghi 
8645 8650 8655 

Asn Gin Val Ala Ghi Leu Ghi Ghi Ghi Leo Ser Lys Leu Ghi Asp Asn 
8660 8665 8670 

Leu Lys Asp Ala Glu Thr Asn Asn Val Glu Asp Tyr He Lys Glu Gry 
8675 8680 8685 

Leu Glu Ghi AJa Be Ala Ttjt Lys Lys Ak Glu 1^ Glu Lys Thr Gin 
8690 8695 8700 

Lys Ghi Leu Asp Ala Ala Leu Asn Ghi Leu Gry Pro Asp Gry Asp Ghi 
8705 8710 8715 8720 

Ghi Gb Thr Pro Ate Pro Ala Pro Gb Pro Ghi Lys Pro Ala Glu Ghi 
8725 8730 8735 

Pro Ghi Asn Pro Ala Pro Ala Pro Lys Pro Ghi Lys Ser Ala Asp Gb 
8740 8745 8750 

Gin Ala Ghi Ghi Asp Tyr Ala Arg Arg Ser Ghi Ghi Ghi Tyr Asn Arg 
8755 8760 8765 

Leu Hit Gin G In Ghi Pro Pro Lys Ala Glu Lys Pro Ala Pro Ala Pro 
8770 8775 8780 

Gto Pro Glu Gb Pro Ala Pro Ala Pro Lys lie Ghi Leu Lys Glu lie 
8785 8790 8795 8800 

Asp Ghi Ser Glu Ser Glu Asp Tyr Ala Lys Gb Gry Phe Arg Ala Pro 
8805 8810 8815 

Leu His Ser Lys Len Asp Ala Lys Lys Ala Lys Leu Ser Lys Leu Glu 
8820 8825 8830 

Glu Leo Ser Asp Lys De Asp Ghi Lea Asp Ah Gin lie Ato Lys Lea 
8835 8840 8845 

Gh Asp Gin Leu Lys Ala Val Glu Gtu Asn Asn Asn Val Ghj Asp Tyr 
8850 8855 8860 

Ser Thr GJu Gly Leu Ghi Lys Tnr lie Ala Ah Lys Lys Hit Ghi Leu 
8865 8870 8875 8880 

Ghi Lys Thr Ghi Ala Asp Leu Lys Lys Ala Val Asn Ghi Pro Ghi Lys 
8885 8890 8895 

Ser Ala Ghi Glu Pro Ser Gin Pro Gin Lys Pro Ala Ghi Ghi Ala Pro 
8900 8905 8910 






Ala Pro Glu Gin Pro Thr Glu Pro Thr Gin Pro Ghi Lys Pro Ala Glu 
8915 8920 8925 

Glu Thr Pro Ala Pro Lys Pro Glu Lys Pro Ala Ghi Gin Pro Asn Ala 
8930 8935 8940 

Ghi Lys Thr Asp Asp Gin Gto Ala Glu Glu Asp Tyr Ala Arg Arg Ser 
8945 8950 8955 8960 

Ghi Ghi Glu Tyr Am Arg Leu Thr Gin Gin Gin Pro Pro Lys Ala Glu 
8965 8970 8975 

Lys Pro Ala Pro Ala Pro Ghi Pro Ghi Gin Thr Ser Ser Leu His 
8980 8985 8990 

(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1453 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 

TTGACAAATA TTTACGGAGG AGGCTTATGC TTAATATAAG TATAGGCTAA AAATGATTAT 60 

CAGAAAAGAG GTAAATTTAG ATGAATAAGA AAAAAATGAT TTTAACAAGC CTAGCCAGCG 
120 

TCGCTATCTT AGGGGCTGGT TTTGTTGCGT CTTCGCCTAC TTTTGTAAGA GCAGAAGAAO 180 
CTCCTGTAGC TAACCAGTCT AAAGCTGAGA AAGACTATGA TGCAGCAGTG AAAAAATCTG 240 
AAGCTGCTAA GAAAGATTAC GAAACGGCTA AAAAGAAAGC AGAAGACGCT CAGAAGAAAT 
300 

ATGATGAGGA TCAGAAGAAA ACTGAGGCAA AAGCGGAAAA AGAAAGAAAA GCTTCTGAAA 
360 

AGATAGCTGA GGCAACAAAA GAAGTTCAAC AAGCGTACCT AGCTTATCTA CAAGCTAGCA 
420 

ACGAAAGTCA GAGAAAAGAG GCAGATAAGA AGATAAAAGA AGCTACGCAC GCAAAGATGA 
480 

GGCGGACGTG CAATTTGACT ATCGAATTCG AACAACAATT GTACTTCCTG AACCAAGTOA 540 

GTTACCTGAG ACTAAOAAAA AAGGAGAAGA GGCAACAAAA GAAGCAGAAG TATCTAAGAA 
600 

AAAATCTGAA GAGGCAGCTA AAGAGGTATA AGTATAGAAA AATAAAATAC TTGAACAAGA 
660 

TGCTGAAAAC GAAAAGAAAA TTOACGTACT TCAAAACAAA GTOGCTGATT TATAAAAAGG 
720 

AATTGCTCTC CATCAAAACA GTCGCTGAAT TAAATAAAGA AATTGCTAGA CTTCAAAGCG 780 

ATTTAAAAGA TGCTGAAGAA AATAATGTAG AAGACTACAT TAAAGAAGGT TTAGAGCAAG 
840 

CTATCACTAA TAAAAAAGCT GAATTAGCTA CAACTCAACA AAACATAGAT AAAACTCAAA 
900 

AAGATTTAGA GGATGCTGAA TTAGAACTTG AAAAAGTATT AGCTACATTA GACCCTGAAG 
960 

GTAAAACTCA AGATGAATTA GATAAAGAAG CTGCTGAAGC TGAGTTGAAT GAAAAAGTTG 
1020 

AAGCTCTTCA AAACCAAGTT GCTGAATTAG AAGAAGAACT TTCAAAACTT GAAGATAATC 
1080 

TTAAAGATGC TGAAACAAAC AACGTTGAAG ACTACATTAA AGAAGGTTTA GAAGAAGCTA 
1140 

TCGCGACTAA AAAAGCTGAA TTGGAAAAAA CTCAAAAAGA ATTAGATGCA GCTCTTAATG 
1200 

AGTTAGGCCC TGATGGAGAT GAAGAAGAGA CTCCAGCGCC GGCTCCTCAA CCAGAAAAAC 
1260 

CAGCTGAAGA GCCTGAGAAT CCAGCTCCAG CACCAAAACC AGAGAAGTCA GCAGATCAAC 
1320 

AAGCTGAAGA AGACTATGCT CGTAGATCAG AAGAAGAATA TAATCGCTTG ACCCAACAGC 
1380 

AACCGCCAAA AGCAGAAAAA CCAGCTCCTG CACCACAACC AGAGCAACCA GCTCCTGCAC 
1440 
CAAAAATAGA GGC 1453 
(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1241 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Met Ghi Hit Ala Ser Aso Leo Tyr Ser Leu Tyr Ser Leu Tyr Ser Met 
1 5 10 15 

GhiThrlloLeuGluLeuGluTbrHisArgSerGluArgLeuGhiAJa 
20 25 30 

LeuAlaSerGluAi^VdAkLeuAlaLeaAlalleLcuGhaLeuGlu 
35 40 45 

GlyLeuTyrAlal^AlflGlyl^uTVrProHhGluValAlaLeuAla 
50 55 60 

Lea Ala Ser Ght Arg Ser Glu Arg Vro Aig Hsr His Arg Pro His Ghi 
65 70 75 80 

Val Ala Leu Ala Arg Gly Ala Leu Ala Gly Leu Gly Leo Ala Leu Ala 
85 90 95 

Pro Aig Val Ala Leo Ala Leu Ala Ala Ser Asn Gly Leu Am Ser Glu 
100 105 110 

Aig Leu Tyr Ser Ala Leu Ala Gly Leu Leo Tyr Ser Ala Ser Pro Thr 
115 120 125 

Tyr Arg Ala Ser Pro Ala Leu Ala Ala Leu Ala Val Ala Leu Leu Tyr 
130 135 140 

Ser Leu Tyr Ser Ser Glu Arg Ofy Leu Ala Leu Ak Ala Leu Ala Leu 
145 150 155 160 

Tyr Ser Leu Tyr Ser Ala Ser Pro Tor Tyr Arg Gly Leu Thr His Arg 
165 170 175 

AJa Leu Ala Leu Tyr Ser Leu Tyr Ser Leu Tyr Ser Ala Leu Ala Gry 
180 185 190 

Leu Ala Ser Pro Ala Leu Ala G ry Leu Asn Leu Tyr Ser Leu Tyr Ser 
195 200 205 

Tnr Tyr Arg Ala Ser Pro Gly Leu Ala Ser Pro Gly Leu Asu Leu Tyr 
210 215 220 






Ser Lea Tyr Ser Tfar His Aig Gly Leu Ab Leu Ala Leu Tyr Scr Ala 
225 230 235 240 

I^AlaGlyLeuI^TyrScrGlyLeuAlaArgGlyLeuTyrSerAla 
245 250 255 

Leu Ala Ser Ghi Afg Gry Leu Leu Tyr Ser He Leu Otu Ala Leu Ala 
260 265 270 

GfyLeuAlaLeuAkThrHisAiBLeuTyScrGlyLeoValAh^ 
275 280 285 

Gly Leu Asa Gly Leo Asn Ala Leu Ala Thr Tyr Arg Leu Ghi Ala Leu 
290 295 300 

Ala Thr Tyr Arg Len Ghi Gly Leu Asn Ala Leu Ala Ser Ghi Arg Ala 
305 310 315 320 

Ser Asn Gly Leu Ser Ghi Arg Gly Leu Asa Ah Arg Gly Leu Tyr Scr 

325 330 335 

G ry Leu Ala Lea Ala AJa Ser Pro Leu Tyr Ser Leu Tyr Ser De Leo 
340 345 350 

Glu Leu Tyr Ser Gry Leu Ala Leu Ala Thr His Arg IBs lie Ser Ala 
355 360 365 

Leu Ala Leo Tyr Ser Met Glu Thr Ala Arg G ly Ala Aig Gly Thr His 
370 375 380 

Arg Cys Tyr Ser Ah Ser Asn Leu Glu Thr His Arg He Leu Ghi Gly 
385 390 395 400 

Leu Pro His Ghi Gly Leu Gly Leu Asn Gly Leu Asn Leu Glu Thr Tyr 
405 410 415 

Arg Pro His Ghi Leu Ghi Ah Ser Asn Gry Leu Asn Val Ala Leu Ser 
420 425 430 

Glu Arg Thr Tyr Arg Leu Ghi Ala Arg Gly Leu Ghi Ala Arg Gly Leu 
435 440 445 

iyr Ser Leu Tyr Ser Gry Leu Asn Leu Tyr Ser Ah Arg Gry Gry Leu 
450 455 460 

Asn Gry Leu Asn Leu Tyr Ser Leu Tyr Ser Gry Leu Asn Leu Tyr Ser 
465 470 475 480 

Thr Tyr Arg Leu Glu Ah Arg Gry Leu Tyr Ser Ah Ser Asn Leu Glu 
485 490 495 

Leu Tyr Ser Ah Arg Gry Gly Leu Asn Leu Ghi Leu Tyr Ser Ala Arg 
500 505 510 

G ly Thr Tyr Arg Leu Tyr Ser Thr Tyr Arg Ah Arg Gly Leu TyrSer 
515 520 525 

He Leu Ghi Leu Tyr Ser Thr Tyr Arg Leu Glu Ah Ser Asn Leu Tyr 
530 535 540 






SerMctGhiThrLeoGhiLeu.TyrSerThrHisArgLeuTyrSerAla 
545 550 555 560 

Arg Gly Leo Tyr Ser Leu Gfai Thr His Aig Thr Tyr Arg Pro His Ghi 
565 570 575 

Leu Tyr Ser Hit His Arg Leu Tyr Ser Ser Glu Arg Lea Ghi nc Lea 
580 585 590 

Ghi Hit Tyr Arg Leu Tyr Ser Leu Tyr Sex Gly Leu Leu Ghi Lea Gin 
595 600 605 

Ser Gin Arg lie Leu Ghi Leu Tyr Ser Tbs His Aig Val Ala Leu Ala 
610 615 620 

Leu Ala Gry Leu Leu Ghi Ala Set Asn Leu Tyr Ser Gly Leu De Leu 
625 630 635 640 

Ghi Ala Leu Ah Ala Aig Gly Leu Ghi Gly Leo Asn Ser Ghi Arg Ala 
645 650 655 

Ser Pro Leu Ghi Leu Tyr Ser Ala Ser Pro Ala Leu Ala Gry Leu Gry 
660 665 670 

Leu Ala Ser Asa Ak Ser Asa Val Ala Leu Gry Leu Ala Scr Pro Thr 
675 680 685 

Tyr Arg lie Leu Ghi Leu Tyr Ser Gry Leu Gry Leu Tyr Leu Ghi Gry 
690 695 700 

Leu Gry Leu Asn Ala Leu Ala He Leu Ghi Thr His Arg Ala Ser Asn 
705 710 715 720 

Leu Tyr Ser Leu Tyr Ser Ala Leu Ala Gly Leu Leu Ghi Ala Leu Ala 
725 730 735 

Thr His Arg Thr His Arg Gry Leu Asn Gry Leu Asn Ala Ser Asn lie 
740 745 750 

Leu Ghi Ala Ser Pro Leu Tyr Ser Tlnr His Arg Gry Leu Asn Leu Tyr 
755 760 765 

SerAlaSerProLeuGhi Gry Leu Ala Ser Pro Ala Leu Ala Gry Leu 

770 775 780 

Leu Ghi Gly Leu Leu Ghi Gry Leu Leu Tyr Ser Val Ala Leu Leu Glu 
785 790 795 800 

Ala Lea Ala Thr His Arg Leu Ghi Ala Ser Pro Pro Arg Gry Leu Gry 
805 810 815 

Leu Tyr Leu Tyr Ser Thr His Arg Gry Leu Asn Ala Ser Pro Gly Leu 
820 825 830 

Leu Glu Ala Ser Pro Leu Tyr Ser Gry Leu Ak Leu Ala Ala Leu Ala 
835 840 845 

Gry Leu Ala Leu Ala Ghy Leu Leu Ghi Ala Ser Asn Gry Leu Leu Tyr 
850 855 860 

Ser Val Ala Leu Gly Leu Ala Leo Ala Leu Chi Gly Leo A$n Ala Ser 
865 870 875 880 

Asn Gly Lew Am Val Ah Leo Ala Leu Ala Gly Leu Leu Glu Gly Leu 
885 890 895 

Gly Leu Gly Leu Leu Ghi Sec Ghi Arg Leu Tyr Ser Leu Glu Gly Leu 
900 905 910 

Ala Ser Pro Ala Ser Asn Leu Ghi Leu Tyr Ser Ala Ser Pro Ala Leu 
915 920 925 

Ala Gry Leu Thr His Arg Ala Ser Asn Ala Ser Asa Val Ah Leu Gly 
930 935 940 

Leu Ah Ser Pro Thr Tyr Arg He Leu Glu Leu Tyr Ser Gly Leu Gry 
945 950 955 960 

Leu Tyr Leu Ghi Gly Leu Gly Leu Ah Leu Ala He Leu Ghi Ah Leu 
965 970 975 

Ah Thr His Arg Leo Tyr Ser Leu 1>r Ser Ah Leu Ah Gly Leu Leu 
980 985 990 

Ghi Gly Leu Leu Tyt Ser Thr His Arg Gly Leu Asn Leu Tyr Ser Gry 
995 1000 1005 

Leu Leu Glu Ah Ser Pro Ah Leu Ah Ah Leu Ah Leu Ghi Ah Ser 
1010 1015 1020 

Asn Gry Leu Leu Gh G ry Leu Tyr Pro Arg Ah Ser Pro Gry Leu Tyr 
1025 1030 1035 1040 

Ah Ser Pro Gry Leu Gry Leu Gry Leu Thr His Arg Pro Arg Ala Leu 
1045 1050 1055 

Ah Pro Arg Ala Lea Ala Pro Arg Gry Leu Asn Pro Arg Gly Leu Leu 
1060 1065 1070 

Tyr Ser Pro Arg Ah Leu Ah Gfy Leu Gly Leu Pro Arg Gly Leu Ah 
1075 1080 1085 

Ser Asn Pro Arg Ala Leu Ala Pro Arg Ala Leu Ala Pro Arg Lau Tyr 
1090 1095 1100 

Ser Pro Arg Gry Leu Leu Tyr Ser Ser Ghi Arg Ah Leu Ah Ah Ser 
1105 1110 1115 1120 

Pro Gry Leu Asn Gry Leu Asn Ah Leu Ah Gry Leu Gry Leu Ah Ser 
1125 1130 1135 

Pro Thr Tyr Arg Ah Leu Ala Ala Arg Gly Ah Arg Gly Ser Ghi Arg 
1140 1145 1150 

Gry Leu Gry Leu Gry Leu Thr Tyr Arg Ah Ser Asn Ah Arg Gry Leu 
1155 1160 1165 
Glu Thr His Arg Gly Leu Asn Gly Leu Asn Gly Leu Am Pro Arg Pro 
1170 1175 1180 

Arg Leu Tyr Ser Ala Leu Ala Gly Leu Leo Tyr Ser Pro Arg Ala Leu 
1185 1190 1195 1200 

Ala Pro Arg Ala Leu Ah Pro Arg Gly Leu Asn Pro Arg Gly Leu Gry 
1205 1210 1215 

Leu Asn Pro Arg Ala Leu Ala Pro Arg Ala Leu Ala Pro Arg Leu Tyr 
1220 1225 1230 

Ser lie Leu Ghi Gry Leu Ala Leo Ala 
1235 1240 

(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

AAGCTTATGA TATAGAAATT TGTAACAAAA ATGTAATATA AAACACTTGA CAAATATTTA 60 

CGGAGGAGGC TTATACTTAA TATAAGTATA GTCTGAAAAT GACTATCAGA AAAGAGGTAA 
120 

ATTTAGATGA ATAAGAAAAA AATGATTTTA ACAAGTCTAG CCAGCGTCGC TATCTTAGGG 
180 

GCTGGTTTTG TTGCGTCTCA GCCTACTGTT GTAAGAGCAG AAGAATCTCC CGTAGCCAGT 240 

CAGTCTAAAG CTGAGAAAGA CTATGATGCA GCGAAGAAAG ATGCTAAGAA TGCGAAAAAA 
300 

GCAGTAGAAG ATGCTCAAAA GGCTTTAGAT GATGCAAAAG CTGCTCAGAA AAAATATGAC 
360 

GAGGATCAGA AGAAAACTGA GGAGAAAGCC GCGCTAGAAA AAGCAGCGTC TGAAGAGATG 
420 

GATAAGGCAG TGGCAGCAGT TCAACAAGCG TATCTAGCCT ATCAACAAGC TACAGACAAA 
480 

GCCGCAAAAG ACGCAGCAGA TAAGATGATA GATGAAGCTA AGAAACGCGA AGAAGAGGCA 
540 

AAAACTAAAT TTAATACTGT TCGAGCAATG GTAGTTCCTG AGCCAGAGCA GTTGGCTGAG 
600 
ACTAAGAAAA AATCAGAAGA AGCTAAACAA AAAGCACCAG AACTTACTAA AAAACTAGAA 
660 

GAAGCTAAAG CAAAATTAGA AGAGGCTGAG AAAAAAGCTA CTGAAGCCAA ACAAAAAGTG 
720 

GATGCTGAAG AAGTCGCTCC TCAAGCTAAA ATCGCTGAAT TGGAAAATCA AGTTCATAGA 
780 

CTAGAACAAG AGCTCAAAGA GATTGATGAG TCTGAATCAG AAGATTATGC TAAAGAAGGT 
840 

TTCCGTGCTC CTCTTCAATC TAAATTGGAT GCCAAAAAAG CTAAACTATC AAAACTTGAA 900 

GAGTTAAGTG ATAAGATTGA TGAGTTAGAC GCTGAAATTG GAAAACTTGA AGATCAACTT 
960 

AAAGCTGCTG AAGAAAACAA TAATGTAGAA GACTACTTTA AAGAAGGTTT AGAGAAAACT 
1020 

ATTGCTGCTA AAAAAGCTGA ATTAGAAAAA ACTGAAGCTG ACCTTAAGAA AGCAGTTAAT 
1080 

GAGCCAGAAA AACCAGCTCC AGCTCCAGAA ACTCCAGOCC CAGAAGCACC AGCTGAACAA 
1140 

CCAAAACCAG OGCCGGCTCC TCAACCAGCT CCCGCACCAA AACCAGAGAA GOCAGCTGAA 
1200 

CAACCAAAAC CAGAAAAAAC AGATGATCAA CAAGCTGAAG AAGACTATGC TCGTAGATCA 
1260 

GAAGAAGAAT ATAATCGCTT GACTCAACAG CAACCGCCAA AAGCTGAAAA ACCAGCTCCT 
1320 

GGACCAAAAA CAGGCTGGAA ACAAGAAAAC GGTATGTGGT ACTTCTACAA TACTGATGGT 
1380 

TCAATGGCGA CAGGATGGCT CCAAAACAAC GGTTCATGGT ACTACCTCAA CAGCAATGGT 
1440 

GCTATGGCTA CAGGTTGGCT CCAATACAAT GGTTCATGGT ATTACCTCAA CGCTAACGGC 
1500 

GCTATGGCAA CAGGTTGGGC TAAAGTCAAC GGTTCATGGT ACTACCTCAA CGCTAATGGT 
1560 

GCTATGGCTA CAGGTTGGCT CCAATACAAC GGTTCATGGT ATTACCTCAA CGCTAACGGC 
1620 

GCTATGGCAA CAGGTTGGGC TAAAGTCAAC GGTTCATGGT ACTACCTCAA CGCTAATGGT 
1680 

GCTATGGCTA CAGGTTGGCT CCAATACAAC GGTTCATGGT ACTACCTCAA CGCTAACGGT 
1740 

GCTATGGCTA CAGGTTGGGC TAAAGTCAAC GGTTCATGGT ACTACCTCAA CGCTAATGGT 
1800 
GCTATCGCAA CAGOTTGGOT GAAAOATOCA GATACCTGGT ACTATCTTQA AGCATCAGGT 
1860 

GCTATGAAAG CAAGCCAATO GTTCAAAGTA TCAGATAAAT GGTACTATCT CAATGGTTTA 
1920 

GGTGCCCTTG CAGTCAACAC AACTGTAGAT GGCTATAAAG TCAATGCCAA TGGTGAATGG 
1980 

GTTTAAGCCG 1990 
(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

CCAGCGTCGC TATCTTAGGG GCTGGTTTTG TTGCGTCTCA GCCTACTGTT GTAAGAGCAG 60 

AAGAATCTCC CGTAGCCAGT CAGTCTAAAG CTGAGAAAGA CTATGATGCA GCGAAGAAAG 
120 

ATGCTAAGAA TGCGAAAAAA GCAGTAGAAG ATGCTCAAAA GGCTTTAGAT GATGCAAAAG 
180 

CTGCTCAGAA AAAATATGAC GAGGATCAGA AGAAAACTGA GGAGAAAGCC GCGCTAGAAA 
240 

AAGCAGCGTC TGAAGAGATG GATAAGGCAG TGGCAGCAGT TCAACAAGCG TATCTACCCT 
300 

ATCAACAAGC TACAGACAAA GCCGCAAAAG ACGCAGCAGA TAAGATGATA GATGAAGCTA 
360 

AGAAACGCGA AGAAGAGGCA AAAACTAAAT TTAATACTGT TCGAGCAATG GTAGTTCCTG 
420 

AGCCAGAGCA GTTGGCTGAG ACTAAGAAAA AATCAGAAGA AGCTAAACAA AAAGCACCAG 
480 

AACTTACTAA AAAACTAGAA GAAGCTAAAG CAAAATTAGA AGACGCTGAG AAAAAAGCTA 
540 

CTGAAGCCAA ACAAAAAGTG GATGCTGAAG AAGTCGCTCC TCAAGCTAAA ATCGCTGAAT 
600 

TGGAAAATCA AGTTCATAGA CTAGAACAAG ACTCAAAGAG ATTGATGAGT CTGAATCAGA 
660 

AGATTATGCT AAAGAAGGTT TCCGTGCTCC TCTTCAATCT AAATTGGATG CCAAAAAAGC 720 
TAAACTATCA AAACTTGAAG AGTTAAGTGA TAAGATTGAT GAGTTAGACG CTGAAATTGC 
780 

AAAACTTGAA GATCAACTTA AAGCTGCTGA AGAAAACAAT AATGTAGAAG ACTACTTTAA 
840 

AGAAGGTTTA GAGAAAACTA TTGCTGCTAA AAAAGCTGAA TTAGAAAAAA CTGAAGCTGA 
900 

CCTTAAGAAA GCAGTTAATG AGCCAGAAAA ACCAGCTCCA GCTCCAGAAA CTCCAG 956 
(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
GGAAGGCCAT ATGCTCAAAG AGATTGATGA GTCT 34 
(2) INFORMATION FOR SEQ ID NO:38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
CCAAGGATCC TTAAACCCAT TCACCATTGG C 31 

(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3222 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

AAGCTTATGC TTGTCAATAA TCACAAATAT GTAGATCATA TCTTGTTTAG GACAGTAAAA 60 

CATCCTAATT ACTTTTTAAA TATTTTACCT GAGTTCATTG GCTTGACCTT GTTGAGTCAT 120 

GCCTATATCA CTTTTGTTTT AGTTTTTCCA GTTTATGCAG TTATTTTGTA TCGACGAATA 180 

GCTGAAGAGG AAAAGTTATT ACATGAAGTT ATAATCCCAA ATGGAAGCAT AAAGAGATAA 
240 

ATACAAAATT CGATTTATAT ACAGTTCATA TTGAAGTGAT ATAGTAAGGT TAAAGAAAAA 
300 

ATATAGAAGG AAATAAACAT GTTTGCATCA AAAAGCGAAA GAAAAGTACA TTATTCAATT 
360 

CGTAAATTTA GTATTGGAGT AGCTAGTGTA GCTGTTGCCA GCTTGTTCTT AGGAGGAGTA 420 

GTCCATGCAG AAGGGGTTAG AAGTGGGAAT AACCTCACGG TTACATCTAG TGGGCAAGAT 
480 

ATATCGAAGA AGTATGCTGA TGAAGTCGAG TCGCATCTAG AAAGTATATT GAAGGATGTC 
540 

AAAAAAAATT TGAAAAAAGT TCAAAAAGAA AAAGATCGCC GTAACTACCC AACCATTACT 
600 

TACAAAACGC TTGAACTTGA AATTGCTGAG TCCGATGTGG AAGTTAAAAA AGCGGAGCTT 
660 

GAACTAGTAA AAGTGAAAGC TAAGGAATCT CAAGACGAGG AAAAAATTAA GCAAGCAGAA 
720 

GCGGAAGTTG AGAGTAAACA AGCTGAGGCT ACAAGGTTAA AAAAAATCAA GACAGATCGT 
780 

GAAGAAGCTA AACGAAAAGC AGATGCTAAG TTGAAGGAAG CTGTTGAAAA GAATGTAGCG 
840 

ACTTCAGAGC AAGATAAACC AAAGAGGCGG GCAAAACGAG GAGTTTCTGG AGAGCTAGCA 
900 

ACACCTGATA AAAAAGAAAA TGATGCGAAG TCTTCAGATT CTAGCGTAGG TGAAGAAACT 
960 

CTTCCAAGCC CATCCCTTAA TATGGCAAAT GAAAGTCAGA CAGAACATAG GAAAGATGTC 
1020 

GATGAATATA TAAAAAAAAT GTTGAGTGAG ATCCAATTAG ATAGAAGAAA ACATACCCAA 
1080 

AATGTCAACT TAAACATAAA GTTGAGCGCA ATTAAAACGA AGTATTTGTA TGAATTAAGT 
1140 

GTTTTAAAAG AGAACTCGAA AAAAGAAGAG TTGACGTCAA AAACCAAAGC AGAGTTAACC 
1200 
GCAGCTTTTG AGCAGTTTAA AAAAGATACA TTGAAACCAG AAAAAAAGGT AGCAGAAGCT 
1260 

GAGAAGAAGG TTGAAGAAGC TAAGAAAAAA GCCAAGGATC AAAAAGAAGA AGATCGCCGT 
1320 

AACTACCCAA CCAATACTTA CAAAACGCTT GAACTTGAAA TTGCTGAGTC CGATGTGAAA 
1380 

GTTAAAGAAG CGGAGCTTGA ACTAGTAAAA GAGGAAGCTA ACGAATCTCG AAACGAGGAA 
1440 

AAAATTAAGC AAGCAAAAGA GAAAGTTGAG AGTAAAAAAG CTGAGGCTAC AAGGTTAGAA 
1500 

AAAATCAAGA CAGATCGTAA AAAAGCAGAA GAAGAAGCTA AACGAAAAGC AGAAGAATCT 
1560 

GAGAAAAAAG CTGCTGAAGC CAAACAAAAA GTGGATGCTG AAGAATATGC TCTTGAAGCT 
1620 

AAAATCGCTG AGTTGGAATA TGAAGTTCAG AGACTAGAAA AAGAGCTCAA AGAGATTGAT 
1680 

GAGTCTGACT CAGAAGATTA TCTTAAAGAA GGCCTCCGTG CTCCTCTTCA ATCTAAATTG 1740 

GATACCAAAA AAGCTAAACT ATCAAAACTT GAAGAGTTGA GTGATAAGAT TGATGAGTTA 
1800 

GACGCTGAAA TTGCAAAACT TGAAGTTCAA CTTAAAGATG CTGAAGGAAA CAATAATGTA 
1860 

GAAGCCTACT TTAAAGAAGG TTTAGAGAAA ACTACTGCTG AGAAAAAAGC TGAATTAGAA 
1920 

AAAGCTGAAG CTGACCTTAA GAAAGCAGTT GATGAGCCAG AAACTCCAGC TCCGGCTCCT 
1980 

CAACCAGCTC CAGCTCCAGA AAAACCAGCT GAAAAACCAG CTCCAGCTCC AGAAAAACCA 
2040 

GCTCCAGCTC CAGAAAAACC AGCTCCAGCT OGAGAAAAAC CAGCTCCAGC TCCAGAAAAA 
2100 

CCAGCTCCAG CTCCAGAAAA ACCAGCTCCA ACTCCAGAAA CTCCAAAAAC AGGCTGGAAA 
2160 

CAAGAAAACG GTATGTGGTA CTTCTACAAT ACTGATGGTT CAATGGCAAC AGGCTGGCTC 
2220 

CAAAACAATG GCTCATGGTA CTACCTCAAC AGCAATGGCG CTATGGCGAC AGGATGGCTC 
2280 

CAAAACAATG GCTCATGGTA CTACCTCAAC AGCAATGGCG CTATGGCGAC AGGATGGCTC 
2340 

CAATACAATG GTTCATGGTA CTACCTCAAC GCTAATGGTG ATATGGCGAC AGGATGGCTC 
2400 
CAATACAATG GTTCATGGTA CTACCTCAAC GCTAATGGTG ATATGGCGAC AGGATGGTTC 
2460 

CAATACAATG GTTCATGGTA CTACCTCAAC GCTAATGGTG ATATGGCGAC AGGATGGTTC 
2520 

CAATACAATG GTTCATGGTA CTACCTCAAC GCTAATGGTG ATATGGCGAC AGGATGGCTC 
2580 

CAATACAATG GTTCATGGTA CTACCTAAAC AGCAATGGTG CTATGGTAAC AGGATGGCTC 
2640 

CAAAACAATG GCTCATGGTA CTACCTAAAC GCTAACGGTT CAATGGCAAC AGATTGGGTG 
2700 

AAAGATGGAG ATACCTGGTA CTATCTTGAA GCATCAGGTG CTATGAAAGC AAGCCAATGG 
2760 

TTCAAAGTAT CAGATAAATG GTACTATGTC AATGGCTCAG GTGCCCTTGC AGTCAACACA 
2820 

ACTGTAGATA GCTATAGAGT CAATGCCAAT GGTGAATGGG TAAACTAAAC TTAATATAAC 
2880 

TAGTTAATAC TGACTTCCTG TAAGAACTCT TTAAAGTATT CCCTACAAAT ACCATATCCT 2940 

TTCAGTAGAT AATATACCCT TGTAGGAAGT TTAGATTAAA AAATAACTCT GTAATCTCTA 3000 

GCCGGATTTA TAGCGCTAGA GACTACGGAG TTTTTTTGAT GAGGAAAGAA TGGCGGCATT 
3060 

CAAGAGACTC TTTAAGAGAG TTACGGGTTT TAAACTATTA AGCTTTCTCC AATTGCAAGA 
3120 

GGGCTTCAAT CTCTGCTAGG TGCTAGCTTG CGAAATGGCT CCCACGGAGT TTGGCGCGCC 
3180 

AGATGTTCCA CGGAGGTAGT GAGGAGCGAG GOCGCGGAAT TC 3222 
(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 864 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 

Phe Ala Ser Lys Ser Glu Arg Lys Val His Tyr Ser Ile Arg Lys Phe 
1 5 10 15 

Ser Ile Gly Val Ala Ser Val Ala Val Ala Ser Leu Phe Leu Gly Gly 
20 25 30 






Val Val His Ala Glu Gly Val Arg Ser Gly Asn Asn Leu Thr Val Thr 
35 40 45 

Ser Ser Gly Gln Asp Ile Ser Lys Lys Tyr Ala Asp Glu Val Glu Ser 
50 55 60 

His Leu Gta Ser He Leu Lys Asp Val Lys Lys Asn Ghi Lys Lys Val 
65 70 75 80 

Ala Ghi Ala Gin Lys Lys Val Ghi Ghi Ala Lys Lys Lys Ala Ghi Asp 
85 90 95 

Gb Lys Ghi Lys Asp Arg Arg Asn Tyr Pro Thr lie Thr Tyr Lys Thr 
100 105 110 

Leu Ghi Leu Ghi He Ala GhiSer Asp Val Ghi Val Lys Lys Ala Ghi 
115 120 125 

Leu Glu Leu Val Lys Val Lys Ala Lys Glu Ser Gta Asp Gin Glu Lys 
130 135 140 

lie Lys Ghi Ala Gto Ala Glu Val Ghi Ser Lys Gin Ala Glu Ala Tnr 
145 150 155 160 

Arg Leu Lys Lys De Lys Thr Asp Arg Glu Ghi Ala Lys Arg Lys Ala 
165 170 175 

Asp Ala Lys Leu Lys Glu Ala Val Glu Lys Asn Val Ah Thr Ser Ghi 
180 185 190 

Gin Asp Lys Pro Lys Arg Arg Ala Lys Arg Gfy Val Ser Gly Ghi Leu 
195 200 205 

Ala Tnr Pro Asp Lys Lys Ghi Asn Asp Ala Lys Ser Ser Asp Ser Ser 
210 215 220 

Val Gry Ghi Ghi Ifcr Leu Pro Ser Pro Ser Leu Asn Met Ala Asn Ghi 
225 230 235 240 

Ser Gta Thr Ghi His Arg Lys Asp Val Asp Ghi l>r lie Lys Lys Met 
245 250 255 

Leu Ser Ghi lie Gin Leu Asp Arg Arg Lys His Thr Ghi Asn Val Asn 
260 265 270 

Leu Asn De Lys Leu Ser Ala lie Lys Tnr Lys Tyi Leu Tyr Ghi Leu 
275 280 285 

Ser Val Leu Lys Ghi Asn Ser Lys Lys Glu Ghi Leu Tnr Ser Lys Tlir 
290 295 300 

Lys Ala Glu Leu Thr Ala Ala Ptae Glu Gta Phe Lys Lys Asp Thr Leu 
305 310 315 320 

Lys Pro Glu Lys Lys Val Ala Ghi Ala Glu Lys Lys Val Glu Ghi Ala 
325 330 335 

Lys Lys Lys Ala Lys Asp Gin Lys Glu Ghi Asp Arg Arg Asn Tyr Pro 
340 345 350 

Thr Asn Thr Tyr Lys Thr Leu Ohi Leu Ghi lie Ala G hi Ser Asp Val 

355 360 365 

Lys Val Lys Ghi Ala Ghi Leu Gtn Leu Val Lys Ghi Ghi Ala Asn Gin 
370 375 380 

Ser Arg Asa Ghi Gla Lys He Lys Gin Ala Lys Glu Lys Val Ghi Ser 
385 390 395 400 

LysLys Ah Gui Ala Thr Arg Leu Glu Lys De Lys Thr Asp Arg Lys 
405 410 415 

Lys Ah Ghi Ghi Ghi Ala Lys Arg Lys Ala Ghi Ghi Ser Ghi Lys Lys 
420 425 430 

Ala Ala Ghi Ala Lys Gin Lys Val Asp Ala Ghi Ghi iyr Ala Leu Ghi 
435 440 445 

Ala Lys De Ala Ghi Leu Ghi T^r Ghi Val Ghi Arg Leu Glu Lys Glu 
450 455 460 

Leu Lys Ghi Ik Asp Ghi Ser Asp Ser Ghi Asp Tyr Leu Lys Ghi Gly 
465 470 475 480 

Leu Arg Ala Pro Leu Gin Ser Lys Leu Asp Tnr Lys Lys Ala Lys Leu 
485 490 495 

Ser Lys Leu Ghi Glu Leu Ser Asp Lys He Asp Glu Leu Asp Ala Ghi 
500 505 510 

He Ala Lys Leu Glu Val Gh Leu Lys Asp Ala Ghi Gly Asn Asn Asn 
515 520 525 

Val Glu Ala Tyr PheLys Glu Gly Leu Ghi Lys Thr Thr Ala Glu Lys 
530 535 540 

Lys Ala Ghi Leu Ghi Lys Ala Ghi Ah Asp Leu Lys Lys Ala Val Asp 
545 550 555 560 

Glu Pro Ghi Thr Pro Ala Pro Ala Pro Gin Pro Ate Pro Ala Pro Ghi 

565 570 575 

Lys Pro Ala Ghi Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala Pro Ala 
580 585 590 

Pro Ghi Lys Pro Ala Plro Ala Pro Ghi Lys Pro Ala Pro Ala Pro Glu 
595 600 605 

Lys Pro Ala Pro Ala Pro Glu Lys Pro Ab Pro Thr Pro Ghi Thr Pro 
610 615 620 

Lys Thr Gly Trp Lys Gln Glu Asn Gly Met Trp Tyr Phe Tyr Asn Thr 
625 630 635 640 

Asp Gly Ser Met Ala Thr Gly Trp Leu Gln Asn Asn Gly Ser Trp Tyr 
645 650 655 

Tyr Leu Asn Ser Asn Gly Ala Met Ala Thr Gly Trp Leu Gln Asn Asn 
660 665 670 

Gly Ser Trp Tyr Tyr Leu Asn Ser Asn Gly Ala Met Ala Thr Gly Trp 
675 680 685 

Leu Gln Tyr Asn Gly Ser Trp Tyr Tyr Leu Asn Ala Asn Gly Asp Met 
690 695 700 

Ala Thr Gly Trp Leu Gln Tyr Asn Gly Ser Trp Tyr Tyr Leu Asn Ala 
705 710 715 720 

Asn Gly Asp Met Ala Thr Gly Trp Phe Gln Tyr Asn Gly Ser Trp Tyr 
725 730 735 

Tyr Leu Asn Ala Asn Gly Asp Met Ala Thr Gly Trp Phe Gln Tyr Asn 
740 745 750 

Gly Ser Trp Tyr Tyr Leu Asn Ala Asn Gly Asp Met Ala Thr Gly Trp 
755 760 765 

Leu Gln Tyr Asn Gly Ser Trp Tyr Tyr Leu Asn Ser Asn Gly Ala Met 
770 775 780 

Val Thr Gly Trp Leu Gln Asn Asn Gly Ser Trp Tyr Tyr Leu Asn Ala 
785 790 795 800 

Asn Gly Ser Met Ala Thr Asp Trp Val Lys Asp Gly Asp Thr Trp Tyr 
805 810 815 

Tyr Leu Glu Ala Ser Gly Ala Met Lys Ala Ser Gln Trp Phe Lys Val 
820 825 830 

Ser Asp Lys Trp Tyr Tyr Val Asn Gly Ser Gly Ala Leu Ala Val Asn 
835 840 845 

Thr Thr Val Asp Ser Tyr Arg Val Asn Ala Asn Gly Glu Trp Val Asn 
850 855 860 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1231 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Ser Asp Ser Ser Val Gry Gb Gb Thr Leu Pro Ser Pro Ser Leu Asn 
1 5 10 15 

Met Ala Asn Gb Ser Gb Thr Gb His Arg Lys Asp Val Asp Gta Tyr 
20 25 30 

De Lys Lys Met Leo Set Gto He Gto Lea Asp Arg Arg Lys His Tin- 
35 40 45 

Gin Asn Gto Gin Sex Pro Val Ala Ser Gin Ser Lys Ala Ghi Lys Asp 
50 55 60 

Tyr Asp AJa Ala Lys Lys Asp Ala Lys Asa Ala Lys Lys Ala Val Glu 
65 70 75 80 

Asp Ala Gin Lys Ala Lea Asp Asp Ala Lys Ala Ala Gin Lys Lys Tyr 
85 90 95 

Asp Glu Asp Val Asn Leu Asa He Lys Leu Ser Ala He Lys Thr Lys 
100 105 110 

iy Leu T^r Ghi Leu Ser Val Leu Lys Glu Asn Ser Lys Lys Gin Ghi 
115 120 125 

Leu Tnr Ser Lys Thr Lys Ala Ghi Leu Tnr Ala Ala Phe Ghi GlnPhe 
130 135 140 

Lys Lys Asp Thr Leu Oto Lys Lys Tnr Gto Gto Lys Ala Ala Leu Gto 
145 150 155 160 

Lys Ala Ala Ser Ghi Ghi Met Asp Lys Ala Val Ala Ala Val Ghi Gto 
165 170 175 

Ala Tyr Leu Ala Tyr Gin Gin Ala Thr Asp Lys Pro Ghi Lys Lys Val 
180 185 190 

Ala Glu Ala Ghi Lys Lys Val Glu Glu Ala Lys Lys Lys Ala Lys Asp 
195 200 205 

Gin Lys Ghi Glu Asp Arg Arg Ash iy Pro Hit Asn Thr Tyr Lys Hir 
210 215 220 

Leu Gto Leu Gto De Ala Gto Ser Asp Val Lys Val Lys Ala Ala Lys 
225 230 235 240 

Asp Ala Ala Asp Lys Met He Asp Gto Ah Lys Lys Arg Gto Ghi Gto 
245 250 255 

Ala Lys Thr Lys Pbe Asn Thr Val Arg Ala Met Val Val Lys Glu Ala 
260 265 270 

Gto Leu Gto Leu Val Lys Gto Gto Ala Asn Gto Ser Arg Asn Glu Gto 
275 280 285 

Lys He Lys Gto Ala Lys Gto Lys Val Gto Ser Lys Lys Ala Gto Ala 
290 295 300 

Thr Arg Leu Gto Lys fle Lys Thr Asp Arg Lys Lys Ala Glu Gto Pro 
305 310 315 320 

Gto Pro Gto Gto Leu AJa Ghi Hir Lys Lys Lys Ser Gto Gto Ala Lys 
325 330 335 
Gh Lys A h Pro Glu Leu Thr Lys Lys Leu Glu Glu Ah Lys Arg Lys 
340 345 350 

Ala Glu Glu Ser Gh Lys Lys Ala Ala Glu Ala Lys Gin Lys Val Asp 
355 360 365 

Ala Gla Gla Tyr Ala Leu Glu Ala Lys De Ala Gin Leu Ghi Tyr Glu 
370 375 380 

Val Gh Arg Leu Gh Lys Glu Leu Lys Gh lie Asp Ghi Gh Ah Lys 
385 390 395 400 

Ala Lys Leu Glu Glu Ala Glu Lys Ly3 Ala Thr Glu Ala Lys G hi Lys 
405 410 415 

Val Asp Ala Glu Ghi Val Ala Pro Gin Ala Lys lie Ala Ghi Lea Glu 
420 425 430 

Asn Gla Val His Arg Leu Glu Gh Glu Leu Lys Glo Tie Asp Gla Ser 
435 440 445 

Asp Ser GJu Asp Tyr Leu'Lys Ghi Gly Leu Arg Ala Pro Leu Gh Ser 
450 455 460 

Lys Leu Asp Thr Lys Lys Ah Lys Leu Ser Lys Leu Gto Ghi Leu Ser 
465 470 475 480 

Asp Lys De Asp Gh Leu Asp Ah Ghi Se Ah Lys Leu Ghi Val Gin 
485 490 495 

Leu Ser Ghi Ser Gh Asp Tyr Ala Lys Glu Gly Phe Arg Ala Pro Leu 
500 505 510 

Gh Ser Lys Leu Asp Ah Lys Lys Ah Lys Leu Ser Lys Leu Glu Glu 
515 520 525 

Leu Ser Asp Lys lie Asp Glu Leu Asp Ah Gh De Ah Lys Leu Glu 
530 535 540 

Asp Gh Leu Lys Asp Ah Gh Gly Asn Asn Asn Val Ghi Ah Tyr Phe 
545 550 555 560 

Lys Ghi Gly Leu Gh Lys Thr Thr Ah Glu Lys Lys Ah G!u Leu Oh 
565 570 575 

Lys Ah Glu Ah Asp Leu Lys Lys Ah Val Asp Gh Pro Gh Thr Pro 
580 585 590 

Ala Pro Ala Pro Gh Lys Ah Ah Gh Gh Asn Asn Asn Val Gh Asp 
595 600 605 

Tyr Phe Lys Ghi (Hy Leu Gh Lys Thr lie Ah Ah Lys Lys Ah Glu 
610 615 620 

Leu Glu Lys Thr Gh Ah Asp Leu Lys Lys Ah Val Asn Gh Pro Gh 
625 630 635 640 

Lys Pro Ah Pro Ah Pro Glu Pro Ah Pro Ah Pro Gh Lys Pro Ah 
645 650 655 
Glo Lys Pro Ala Pro Ala Pro Gto Lys Pro Ala Pro Ala Pro Ghi Lys 
660 665 670 

Pto Ala Pro Ala Pro Ghi Lys Pro Ah Pro Ala Thr Pto Ala Pro Ghi 
675 680 685 

Ala Pro Ala Ghi Gin Pro Lys Pro Ala Pro Ala Pro Ghi Pro Ala Pro 
690 695 700 

Ala Pro Lys Pro Glu Lys Pro Ala Ghi Gin Pro Lys Pro Ghi Lys Ttar 
705 710 715 720 

Asp Asp Gin Gin Ala Ghi Glu Asp Tyr Ala Arg Arg Pro Ghi Lys Pro 
725 730 735 

AlaProAbProGhiLysProAlaProThrProGluThrl^LysThr 
740 745 750 

Gry Trp Lys Gin Ghi Asn Gly Met Tip Tyr Phe Tyr Asa Tot Asp Gry 
755 760 765 

Ser Met Ala Thr Gry Trp Ser Glu Ghi Glu Tyr Asn Arg Leu Thr Gin 
770 775 780 

Gto Gh Pro Pro Lys Ah Ghi Lys Pro Ala Pro Ala Pro Lys Thr Gry 
785 790 795 800 

TrpLysGhiGhAsaGlyMetTtpTyrPiwTyrAsnThrAspGlyScr 
805 810 815 

Leu Gin Asn Asn Gry Ser Trp Tyr Tyr Leu Asn Ser Asn Gly Ala Met 
820 825 830 

Ala Thr Gly Trp Leu Gin Asn Asn Gly Ser Trp Tyr Tyr Leu Asn Ser 
835 840 845 

Asn Gly Ala Met Ala Thr Gly Trp Leu Gto Tyr Asn Gly Ser Trp Tyr 
850 855 860 

Tyr Leo Met Ala Thr Gry Trp Leu Gto Asn Asn Gry Ser Trp Tyr Tyr 
865 870 875 880 

Leu Asn Ser Asn Gfy Ala Met Ala Thr Gry Trp Leu Gto Tyr Asn Gly 
885 890 895 

Ser Trp Tyr Tyr Leu Asn Ala Asn Gly Asp Met Ala Thr G ly T*p Leu 
900 905 910 

Gin Tyr Asn Gly Ser Trp Tyr Tyr Leu Asn AU Asn Gly Asp Met Ala 
915 920 925 

Thr Gry Trp Phe Gto Tyr Asn Gry Ser Trp Tyr Tyr Leu Asn Ala Asn 
930 935 940 

Gry Asp Met Ala Thr Gry Trp Asn Ala Asn Gry Ala Met Ala Thr Gfy 
945 950 955 960 

Trp Ala Lys Val Asn Gly Ser Trp Tyr Tyr Leu Asn Ala Asn Gry Ala 
965 970 975

Met Ala Thr Gry Trp Leu Gin Tyr Asn Gry Ser Trp Tyr Tyr Leu Asn 
980 985 990 

Ala Asn Gry Ala Met Ala Thr Gly Tip Phe Gin Tyr Am Gly Ser Tip 
995 1000 1005 

Tyr Tyr Leu Asn Ala Asn Gly Agp Met Ala Thr Gly Tip Leo GKn Tyr 
1010 1015 1020 

Asn GJy Ser Trp Tyr Tyr Leu Asn Ser Asn Gly Ala Met Val Thr GJy 
1025 1030 1035 1040 

Trp Leu Gin Asn Asn Gry Ser Trp Tyr Tyr Leu Ala Lys Val Asn Gry 
1045 1050 1055 

Ser Trp Tyr Tyr Leu Asn Ala Asn Gry Ala Met Ala Thr Gty Trp Leu 
1060 1065 1070 

Gin Tyr Asa Gly Ser Trp Tyr Tyt Leu Asn Ala Asn Gry Ala Met Ala 
1075 1080 1085 

Thr Gry Trp Ala Lys Val Asn Gly Ser Trp Tyr Tyr Leu Asn Ala Asn 
1090 1095 1100 

Gly Ser Met Ala Thr Asp Trp Val Lys Asp Gry Asp Thr Trp Tyr Tyr 
1105 1110 1115 1120 

Leu Gin Ala Ser Gry Ala Met Lys Ala Ser Gin Trp Phe Lys Val Ser 
1125 1130 1135 

Asp Lys Trp Tyr Tyr Val Asn Gry Ser Gry Ala Leu Ala Val Asn Asn 
1140 1145 1150 

Ala Asn Gly Ala Met Ala Thr Gry Trp Val Lys Asp Gry Asp % 
1155 1160 1165

Tyr Tyr Leu GIu Ala Ser Gry Ala Met Lys Ala Ser Gin Trp Phe Lys 
1170 1175 1180 

Val Ser Asp Lys Trp Tyr Tyr Val Asn Gry Leu Gry Ala Leu Ala Val 
1185 1190 1195 1200 

Asn Thr Thr Val Asp Ser lyrArg Val Asn Ala Asn Gry Ghi Trp Val 
1205 1210 1215 

Thr Thr Val Asp Gry TVr Lys Val Asn Ala Asn Gry Ghi Trp Val 
1220 1225 1230 

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 588 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Glu Gly VaJ Arg Ser Gly Asn Asa Leu Thr Val Thr Ser Ser GJy Gin 
1 5 10 15 

Asp lie Ser Lys Lys Tyr Ah Asp Ghi Val Ghi Ser His Leu Ghi Ser 
20 25 30 

Ite Leu Lys Asp VaJ Lys Lys Asn Leu Lys Lys Val Gin His Tbr Gin 
35 40 45 

Asn VaJ Gly Lea lie Thr Lys Leu Ser Ghi De Lys Lys Lys Tyr Leu 
50 55 60 

Tyr Asp Leu Lys Vol Asn Val Leu Ser Glu Ala Ghi Leu Thr Ser Lys 
65 70 75 80 

Tnr Lys Glu Thr Lys Glu Lys Leu Thr Ala Thr Pbe Ghi Gin Phe Lys 
85 90 95 

Lys Asp Thr Leu Pro Thr Glu Pto Ghi Lys Lys Val Ala Glu Ah Gin 
100 105 110

Lys Lys Val Glu Gin Ah Lys Lys Lys Ala Glu Asp Ghi Lys Glu Lys 
115 120 125 

Asp Arg Arg Asn Tyr Pro Thr De Thr Tyr Lys Thr Leu Ghi Leu Glu 
130 135 140 

He Ala Ghi Ser Asp Val Ghi Val Lys Lys Ala Glu Leo Glu Leu Val 
145 150 155 160 

Lys Val Lys Ala Lys Ghi Ser Gin Asp Ghi Ghi Lys He Lys Ghi Ah 
165 170 175 

Ghi Ah Ghi Val Ghi Ser Lys Gto Ah Ghi Ah Hit Arg Leu Lys Lys 
180 185 190 

lie Lys Thr Asp Arg Ghi Ghi Ala Lys Arg Lys Ala Asp Ala Lys Leu 
195 200 205 

Lys Ghi Ah Val Ghi Lys Asn Val Ah Thr Ser Glu Gto Asp Lys Pro 
210 215 220 

Lys Arg Arg Ah Lys Arg Gly Val Ser Gry Glu Leu Ah Thr Pro Asp 
225 230 235 240 

Lys Lys Ghi Asn Asp Ala Lys Ser Ser Asp Ser Ser Val Gly Gh Thr 
245 250 255 

Leu Pro Ser Pro Ser Leu Asn Met Ah Asn Glu Ser Gin Thr Gh His 
260 265 270 

Arg Lys Asp Val Asp Glu Tyr lie Lys Lys Met Leu Ser Ghi lie Gin 
275 280 285 
Leu Asp Arg Arg Lys His Thr Gin Asn Val Asn Leu Asn Oe Lys Leu
290 295 300 

Ser Ala lie Lys Thr Lys Tyr Leu Tyr Ghi Leu Ser Val Leu Lys Glu 
305 310 315 320

Asn Ser Lys Lys Glu Ghi Leu Thr Ser Lys Thr Lys Ala Glu Lea Thr 

325 330 335 

Ala Ala Phe Glu Gin Pbe Lys Lys Asp Thr Leu Lys Pro Ghi Lys Lys 
340 345 350 

Val Ala Ghi Ala Ghi Lys Lys Val Glu Ghi Ala Lys Lys Lys Ala Lys 
355 360 365 

Asp Glu Lys Ghi Ghi Asp Arg Arg Asn Tyr Pro Thr Asn Thr Tyr Lys 
370 375 380 

Thr Leu Gbi Leu Ghi lie Ala Glu Ser Asp Val Lys Val Lys GId Ala 
385 390 395 400 

Ghi Leu Ghi Leu Val Lys Ghi Ghi Ala Asn Glu Ser Arg Asn Ghi Ghi 
405 410 415 

Lys De Lys Gb AJa Lys Ghi Lys Val Ghi Ser Lys Lys Ala Ghi Ala 
420 425 430 

Thr Arg Leu Ghi Lys De Lys Thr Asp Arg Lys Lys Ate Ghi Ghi Glu 
435 440 445 

Ala Lys Arg Lys Ala Ghi Ghi Ser Ghi Lys Lys Ala Ala Ghi Ah Lys 
450 455 460 

Gin Lys Val Asp Ala Glu Glu Tyr Ala Leu Ghi Ala Lys lie Ala Glu 
465 470 475 480 

Leu Ghi Tyr Ghi Val Ghi Arg Leu Leu Lys Glu Leu Lys Glu He Asp 
485 490 495 

Ghi Ser Asp Ser Ghi Asp Tyr Leu Lys Ghi Gry Leu Arg Ala Pro Leu 
500 505 510 

Gin Ser Lys Leu Asp Hit Lys Lys Ah Lys Leu Ser Lys Leu Ghi Ghi 
515 520 525 

Leu Ser Asp Lys He Asp Glu Leu Asp Ala Glu lie Ala Lys Leu Glu 
530 535 540 

Val Gin Leu Lys Asp Ala Ghi Gry Asn Asn Asn Val Glu Ala Tyr Phe 
545 550 555 560 

Lys Ghi Gly Leu Ghi Lys Thr Thr Ala Ghi Lys Lys Ala Glu Leu Glu 
565 570 575 

Lys Ala Ghi Ala Asp Leu Lys Lys Ala Val Asp Ghi 
580 585 

(2) INFORMATION FOR SEQ ID NO:43: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1296 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

CCAAGCTATT AGGTGACACT ATAGAATACT CAAGCTATGC ATCAAGCTTA TGCTTGTCAA 60 

TAATCACAAA TATGTAGATC ATATCTTGTT TAGGACAGTA AAACATCCTA ATTACTTTTT 120 

AAATATTCTT CCTGAGTTGA TTGGCTTGAC CTPGTTOAGT CATGCTTATG TOACTTTTGT 180 

TTTAGTTTTT CCAGTTTATG CAGTTATTTT GTATCOACOA ATAGCTGAAG AGGAAAAGCT 240 

ATTACATGAA GTTATAATCC CAAATGGAAG CATAAAGAGA TAAATACAAA ATTCGATTTA 
300 

TATACAGTTC ATATTGAAGT AATATAGTAA GGTTAAAGAA AAAATATAGA AGGAAATAAA 
360 

CATGTTTGCA TCAAAAAGCG AAAGAAAAGT ACATTATTCA ATTCGTAAAT TTAGTATTGG 420 

AGTACTAGTO TAGCTGTTGC CAGTCTTGTT ATGGGAAGTG TGGTTCATGC ACCAGAAAAC 480 

GAGGAAGTAC CCAAGCAGCC CTTCTTCTAA TATGGCAAAG ACAGAACATA GG AAAGCGCT 
540 

AAACAGTCGT CGATGAATAT ATAGAAAAAA TGTTGAGGGA GATTCAACTA GATAGAAG AA 
600 

AACATACCCA AAATGTCGCC TTAAACATAA AGTTGAGCGC AATTAAACGA AGTATTTGCG 
660 

TGAATTAATG TTTAGAAGAG AAGTCGAAAT G AGTTGCCGT CAGAAATAAA AGOGAAGTTA 
720 

GACGCCGCTT TTGAAAGTTT AAAAAAGATA CATTGAAACC AGGAGAAAAG GTAGCGAAGC 
780 

TAAGAAGAAG TTGAAGAAGC TAAGAAAAAG CCAGGATCAA AAAGAAGAAG ATCGCGTAAC 
840 

TACCCAACCA ATACTTCAAA ACGCTTGACC TTOAAATTGC TOAGTCGATG TGAAAGTTAA 900 

AGAAGCGGAG CTTG AACTAG TAAAGAGGAA GCTGAACTCG AG ACG AGGAA AAAATTAAGC 
960 

AAGCAAAAGC GAAAGTTGAG AGTAAAAAAG CTGAGGCTAC AAGGTTAGAA AACATCAAGA 
1020 
CAGATCTAAA AAAGCAGAAG AAGAAGTAAA CGAAAAGCAG CAGAAGAAGA TAAAGTTAAA
1080 

GAAAAACCAG CTGAACAACC ACAACCAGCG CCGGTACTCA ACCAGAAAAA CCAGCTCCAA 
1140 

AACCAGAGAA GCCAGCTGAA CAACCAAAAG CAGAAAAAAC AGATG ATCAA CAAGCTGAAG 
1200 

AAGACTATGC TCGTAGATCA GAAGAAGAAT ATAATCGCTT OATCAACAGC AACCGCCAAA 
1260 

AACTGAAAAA OCAGCACAAC CATTACTCCA AAAACA 1296 
(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 

Ah Ala Ala Ala Ala Gty Cys Thr Ala Ala Ala Cy$ Thr Ala Thr Cys 
1 5 10 15 

Ala Ala Ala Ala Cys Thr Thr Gry Ala Ala Gly Ala Gry Thr Thr Ala 
20 25 30 

AJa Gly Thr Gly Ala Tlir Ala Ala Oty Ala Thr Thr Gly Ala Thr Gly 
35 40 45 

Ah Gry Ah Ah Ah Ah Gry C^TTir Thr Gry Ah <^ Cys ^ 
50 55 60 

Thr Gly Ala Ala Ala Thr Hit Gry Cys Thr Gry Ala Gly Thr Tyr Cys 
65 70 75 80 

Gry Ala Thr Gly Thr Gly Ala Ala Ala Gry Thr Thr Ala Ah Ala Gry 
85 90 95 

Ah Ah Thr Thr Ah Gry Ah Cys Gry Cys Thr Gry Ah Ala Ala Thr 
100 105 110 

Thr Gty Cys Ah Ah Ah Ah Cys Thr Thr Gry Ah Ah Gty Ah Thr 
115 120 125 

Cys Ah Ah Cys Thr Thr Ah Ala Ah Gry Cys Thr Gry Cys Thr Gry 
130 135 140 

Ah Ah Gry Ah Gry Cys Gry Gry Ah Gry Cys Thr Thr Gry Ah Ala 
145 150 155 160 
Cys TTir Ah Gly Thr Ah Ah Ah Arg Gly Ala Gly Gly Ala Ah Gly
165 170 175 

Cys Thr Met Met Arg Gly Ala Ala Tyr Cys Thr Cys Gly Ala Gly Ala 
180 185 190 

Cys Gly Ala Gly Gly Ala Ah Ala Ala Cys Ah Ala Thr Ala Ala Thr 
195 200 205 

Gly Thr Ala Gly Ala Ala Gly Ala Cys Thr Ah Cys Thr Hit Ttnr Ala 
210 215 220 

Ala Ala Gly Ala Ala Gly Gly Thr Thr Thr Ala Gly Ala Gly Ala Ala 
225 230 235 240 

Ala Ala Cys Thr Ala Thr Thr Gry Ala Ala Ala Ala Ala Hit Thr Ala 
245 250 255 

Ala Gly Cys Ala Ah Gly Cys Ala Ala Ala Ala Gly Cys Gly Ala Ala 
260 265 270 

Ala Gry Hr Tlir Gry Ala Gry Ala Gly Cys Thr Gry Cys Thr Ala Ala 
275 280 285 

Ala Ala Ala Ala Gly Cys ThrGly Ala Ala Thr Thr Ala Gly Ala Ala 
290 295 300 

Ala Ala Ala Ala Cys Tlir Gry Ala Ala Gly Cys Thr Gry Ala Cys Cys 
305 310 315 320 

Thr Thr Thr Ala Ala Ala Ala Ala Ala Gry Cys Thr Gry Ala Gry Gry 

325 330 335 

Cys Thr Ala Pys Ala Ala Gry Gry Thr Thr Ala Gly Ala Ala Ala Ala 
340 345 350 

Cys Ala Thr Cys Ala Ala Gly Ala Cys Ala Gly Ala Tin* Asn Gly Thr 
355 360 365 

Ala Ah Gly Ah Ah Ah Gry Cys Ah Gly Thr Tlir Ala Ah Thr Gry 
370 375 380 

Ala Gly Cys Cys Ala Gry Ala Ala Ala Ala Ala Cys Cys Ala Gly Cys 
385 390 395 400 

Thr Cys Pys Ala Gry Cys Thr Cys Cys Ala Gry Ah Ala Ala Cys Thr 
405 410 415 

Cys Cys Ah Al a Ah Ah Ah Ah Gry Cys Ah Gry Ah Ah Gry Ah 
420 425 430 

Ah Gry Ah Ah Gry Asn Thr Ah Ala Ah Cys Gry Ala Ah Ah Ah 
435 440 445 

Gry Cys Ah Gly Cys Ah Gly Ala Ah Gry Ah Ah Gly Ah Thr Ah 
450 455 460 

Ah Ah Gry Cys Cys Cys Cys Ala Gly Ah Ala Gry Cys Ala Cys Cys 
465 470 475 480 
Ala Gry Qys Thr Gly AJa Ala Qys Ala Ala Cys Qys Ala Ala Ala Ala
485 490 495

Cys Cys Ala Gry Cy» Gly Cys Cys Gly Gly Cys Thr Cys Cys Thr Cys 
500 505 510 

Ate Ala Cys Ala Gry Thr Thr Ala Ala Ala Gly Ala Ala Ala Ala Ala 
515 520 525 

Cys Cys Ala Gry Cys Thr Gly Ala Ala Cys Ala Ala Cys Cys Ala qys 
530 535 540 

Ala Ala Cys Cys Ala Gry Cys Ory Cys Qys Gly Gly Asa Thr Ala Cys 
545 550 555 560 

Thr Cys Ala Ala Cys Cys Ala Gry Cys Thr Cys Cys Cys Gly Cys Ala 

565 570 575 

Cys Cys Ala Ala Ala Ala Cys Cys Ala Gly Ala Gly Ala Ala Gry Cys 
580 585 590

Cys Ala Gry Cys Thr CHy Ala Ala qys Ala Ala Cys Cys Ala Ala Ala 
595 600 605 

Ala Cys Cys Ala QysAlaGly Ala Ala Ala Ala Ala Cys Qys Ala Gry 
610 615 620 

Cys Thr Cys Cys Ala Ala Ala Ala Cys Cys Ala Gly Ala Gry Ala Ala 
625 630 635 640 

GW Cys Cys Ala Gry Cys Tht Gry Ala Ala Cys Ala Ala Cys Cys Ala 
645 650 655 

Ala Ala Ala Gly Cys Ala Gly Ala Ala Ala Ala Ala Ala Cys Ala Gly 
660 665 670 

Ala Thr Gry Ala Hit Cys Ala Ala Qys Ala Ala Gly Cys Hit Gly Ala 
675 680 685 

Ala Gty Ala Ala Gry Ala Cys Thr Ala Thr Gly Qys Thr Cys Gry Hit 
690 695 700 

Ala Gly AJa Thr Cys AJa Gry Ala Gly Ala Ala Ala Ala Ala Ala Cys 
705 710 715 720 

Ala Gry Ala Tlir Gry Ala Thr Cys Ala Ala Cys Ala Ala Gly Cys Thr 
725 730 735 

Gry Ala Ala Gly Ala Ala Gry Ala Cys Thr Ala Thr Gly Cys Thr Cys 
740 745 750 

Gly Thr Ala Gly Ala Thr Cys Ala Gry Ala Ala Gry Ala Ala Gly Ala 
755 760 765 

Ala Thr AJa Thr Ala Ala Tar Cys Gry Cys Thr Thr Gly Ala Cys Thr 
770 775 780 

Cys Ala Ala Cys Ala Gly Cys Ala Ala Cys Cys Gly Cys Cys Ala Ala 
785 790 795 800

AlaAlaGlyCysThrOlyAlaAlaAlaAlaAlaCysAlaOJyAlaAla 
805 810 815 

GlyAlaAlaThrAIaThrAhAlaThrC^GIyCysThrThrGlyAh 
820 825 830 

A^ThrCysAJaAlaCysAlaGlyCysAlaAlaCysCysGlyCysCys 
835 840 845 

Ala Ala Ate AkAk Cys Thr Gly Ala Ala Ala Ala Afc Cys C^ Ala 
850 855 860 

Gly Cys Thr Cys Cys Thr Gly Cys Ala Cys Cys Ala Ala Ala Ala Ala 
865 870 875 880 

Cys Ala Qys Ala Gly Cys Ala Qys Ala Ala Cys Cys Ala Thr Asa Thr 
885 890 895 

Ala Cys Tfcr Cys Cys Ala Ala Ala Ala Ala Cys Ala 
900 905 

(2) INFORMATION FOR SEQ ID NO:45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

AAGCTTATGC TTGTCAATAA TCACAAATAT GTAGATCATA TCTTGTTTAG AAGCTTATGC 60 

TTGTCAATAA TCACAAATAT GTAGATCATA TCTTGTTTAG GACAGTAAAA CATCCTAATT 120 

ACTTTTTAAA TATTTTACCT GAGTTGATTG GACAGTAAAA CATCCTAATT ACTTTTTAAA 180 

TATTCTTCCT GAGTTGATTG GCTTGACCTT GTTGAGTCAT GCCTATATGA CITTTGTTTT 240 

AGTTTTTCCA GCTTGACCTT GTTGAGTCAT GCTTATGTGA CTTTTGTTTT AGTTTTTCCA 300 

GTTTATGCAG TTATTTTGTA TCGACGAATA GCTGAAGAGG AAAAGTTATT GTTTATGCAG 360 

TTATTTTGTA TCGACGAATA GCTGAAGAGG AAAAGCTATT ACATGAAGTT ATAATCCCAA 420 

ATGGAAGCAT AAAGAGATAA ATACAAAATT ACATGAAGTT ATAATCCCAA ATGGAAGCAT 
480 

AAAGAGATAA ATACAAAATT CGATTTATAT ACAGTTCATA TTGAAGTGAT ATAGTAAGGT 
540 
TAAAGAAAAA CGATTTATAT ACACTTCATA TTGAAGTAAT ATAGTAAGGT TAAAGAAAAA
600 

ATATAGAAGG AAATAAACAT GTTTGCATCA AAAAGCGAAA GAAAAGTACA ATATAGAAGG 
660 

AAATAAACAT GTTTGCATCA AAAAGCGAAA GAAAAGTACA TTATTCAATT CGTAAATTTA 
720 

GTATTGGAGT AGCTAGTGTA GCTGTTGCCA TTATTCAATT CGTAAATTTA GTATTGGAGT 780 

ACTAGTGTAG CTGTTGCCAG CTTOTTCTTA GGAGGAGTAG TCCATGCAGA AGGGGTTAGA 
840 

AGTGGGAATG TCTT G TTATG GGAAGTGTGG TTCATGCACC AGAAAACGAG GAAGAACCTC 
900 

ACGGTTACAT CTAGTGGGCA AGATATATCG AAGAAGTATG TACCCAAGCA GCCCTTCTTC 960 

TAATATGGCA AAGACAGAAC ATAGG AAAGC TGATG AAGTC GAGTCGCATC TAG AAAGTAT 
1020 

ATTGAAGGAT GTCCGCTAAA CAOTCGTCOA TGAATATATA GAAAAAATGT TG AGGGAGAT 
1080 

TAAAAAAAAT TTGAAAAAAG TTCAACATAC CCAAAATGTC GGCTTAATTA CCAACTAGAT 
1140 

AGAAGAAAAC ATACCCAAAA TGTCGCCTTA AACATAAAGT TGAGCGAAAT TAAAAAGAAG 
1200 

TATTTGTATG ACTTAAAAGT TAAAAGTTGA GCGCAATT AA ACGAAGTATT TGCGTOAATT 
1260 

AATGTTTAGA TOTTITATCG GAAGCTGAGT TGACGTCAAA AACAAAAGAA ACAAAAGAAA 
1320 

AGAG AAGTCG AAATGAGTTG CCGTCAGAAA TAAAAGCGAA GTTAACCGCA ACTTTTGAGC 
1380 

AGTTTAAAAA AGATACATTA CCAACAGAAA GTTAGACGCC GCTTTTGAAA GTTTAAAAAA 
1440 

GATACATTGA AACCAGAAAA AAAGGTAGCA GAAGCTCAGA AGAAGGTTGA AGAAGCTAAG 
1500 

AACCAGGAGA AAAGGTAGOG AAGCTAAGAA G AAGTTGAAG AAGCTAAGAA AAAAGCCGAG 
1560 

GATCAAAAAG AAAAAG ATCG CCGTAACTAC CCAACCATTA AAAGCCAGG A TCAAAAAGAA 
1620 

GAAG ATCGCG TAACTACCC A ACCAATACTT ACAAAACGCT TGAACTTGAA ATTGCTGAGT 
1680 

CCGATGTGGA AGTTAAACTT CAAAACGCTT GACCTTGAAA TTGCTGAGTC GATGTGAAAG 
1740 
TTAAAAAAGC OGAGCTTGAA CTAGTAAAAG TGAAAGCTAA GGAATCTCAA GACGAGAAGC
1800 

OGAGCTTGAA CTAGTAAAGA GGAAGCTGAA CTOGAGAOGA GGAAAAAATT AAGCAAGCAG 
1860 

AAGCGG AAGT TGAGAGTAAA CAAGCTGAGA GGAAAAAATT AAGCAAGCAA AAGOGAAAGT 
1920 

TGAGAGTAAA AAAGCTGAGG CTACAAGGTT AAAAAAAATC AAGACAGATC GTGAAGAGCT 
1980 

ACAAGGTTAG AAAACATCAA GACAGATGTA AAAAAGCAGA AGAAG AAGCT AAACGAAAAG 
2040 

CAG AGTAAAC GAAAAGCAG 2059 
(2) INFORMATION FOR SEQ ID NO:46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 605 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

SerGInThrGhiHis ArgLysAsp ValAspGluTyrllcLysLysMet 
1 5 10 15

Leu Ser Glu lie Gin Leu Asp Aig Arg Lys His Thr Gin Asa Val Asa 
20 25 30 

Leu Asn He Lys Led Ser Ala fle Lys Thr Lys Tyr Leu Tyr Ala Lys 
35 40 45 

Thr Ghi His Arg Lys Ato Ala Lys Xaa Val Val Asp Ghi Ty lie Ghi 
50 55 60 

Lys Met Leu Arg Ghi He Gm Leu Asp Arg Arg Lys HBs Thr Gin Asn 
65 70 75 80 

Val Ala Leu Asn He Lys Leu Ser Ala lie Xaa Thr Lys Tyr Leu Arg 
85 90 95 

Glu Leu Ser Val Leu Lys Ghi Am Ser Lys Lys Glu Ghi Leu Thr Ser 
100 105 110

Lys Thr Lys Ala Glu Leu Thr Ala Ala Phe Ghi Gin Phe Lys Lys Asp 
115 120 125 

Thr Leu Lys Pro Ghi Lys Lys Val Ala Gra Ala Ghi Lys Lys Val Glu 
130 135 140 
Gb Ala Ghi Leo Xaa Val Xaa Gb Ghi Lys Ser Xaa Xaa Glu Leu Pro
145 150 155 160

Ser Gb lie Lys Ah Lys Leu Asp Ala Ala Phe Xaa Lys Fbe Lys Lys 
165 170 175 

Asp Thr Leu Lys Pro Gry Gb Lys Val AbGbAb Lys Lys Xaa Vai 
180 185 190 

Gb Gb Ala Lys Lys Lys Ala Lys Asp Gb Lys Gb Gb Asp Arg Arg 
195 200 205 

Asa Tyt Pro Thr Asn Thr Tyr Lys Thr Leu Gb Leu Gb lie Ala Glu 
210 215 220 

Ser Asp Val Lys Val Lys Gb Ab Gb Leu Gb Leu Val Lys Gb Gb 
225 230 235 240 

Ala Asn Gb Ser Arg Lys Xaa Lys Ala Xaa Asp Gb Lys Gb Gb Asp 
245 250 255 

Arg Arg Asn Tyt Pro Thr Asn Tto Xaa Lys Hit Leu Asp Leu Gb Dc 
260 265 270 

Ala Gb Xaa Asp Val Lys Val Lys Gb Ala Gb Leu Gb Leu Val Lys 
275 280 285 

Gb Gb Ala Xaa Gb Xaa Arg Asn Gb Gb Lys He Lys Gb Ab Lys 
290 295 300 

Gb Lys Val Gb Ser Lys Lys Ab Gb Ab Thr Arg Leu Gb Lys De 
305 310 315 320 

Lys Thr Asp Arg Lys Lys AlaGbGb Glu Ab Lys Arg Lys Ala Gb 
325 330 335 

Gb Ser Gb Lys Lys Ala Ab Gb Ab Asp Gb Gb Lys De Lys Gb 
340 345 350 

Ab Lys Ab Lys Val Gb Ser Lys Lys Ab Gb Ab Thr Arg Leu Gb 
355 360 365 

Asn lb Lys Thr Asp Xaa Lys Lys Ab Gb Gb Gb Xaa Lys Arg Lys 
370 375 380 

Ab Ala Gb Gb Asp Lys Ser Lys Leu Asp Thr Lys Lys Ab Lys Leu 
385 390 395 400 

Ser Lys Leu Gb Gb Leu Ser Asp Lys tie Asp Gb Leu Asp AbGb 
405 410 415 

lie Ab Lys Leu Gb Val Gb Leu Lys Asp Ab Gb Gry Asn Asn Asn 
420 425 430 

Val Glu Ab Tyr Phe Lys Gb Gry Val Lys Glu Lys Pro Ab Gb Gb 
435 440 445 

Leu Gb Lys Thr Thr Ab Gb Lys Lys Ab Gb Leu Gb Lys Ab Gb 
450 455 460 
Ala Asp Leu Lys Lys Ala Val Asp Glu Pro Glu Tlir Pro Ala Pro Ala
465 470 475 480 

Pro Gin Pro Ala Pro Ala Pro Glu Lys Pro Ala Ghi Lys Pro Ala Pro 
485 490 495 

Ala Pro Pro Gin Pro Ah Pro Xaa Thr Gin Pro Ghi Lys Pro Ala Pro 
500 505 510 

Lys Pro G hi Lys Pro Ala Glu Gin Pro Lys Ala Ghi Lys Ghi Lys Pro 
515 520 525 

Ala Pro Ala Pro Ghi Lys Pro Ala Pro Ala Pro Glu Lys Pro Ala Pro 
530 535 540 

Ala Pro Ghi Lys Pro Ala Pro Ala Pro Gin Lys Pro Ala Pro Thr Pro 
545 550 555 560 

Gin Thr Pro Lys tbr Thr Asp Asp Gin Gin Ala Ghi Ghi Asp Tyr Ala 
565 570 575 

Arg Arg Ser Glu Ghi GhiTyr Asn Arg Leu Xaa Gin Gin GtaProPro 
580 585 590 

Lys Thr Ghi Lys Pro Ala Gh Pro Xaa Thr Pro Lys Hit 
595 600 605 

(2) INFORMATION FOR SEQ ID NO:47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 623 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: amino acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

Ala Lys Lys Asp Ala Lys Asn Ala Lys Lys Ala Val Ghi Asp Ala Gin 
1 5 10 15

Lys Ala Leu Asp Asp Ala Lys Ala Ala Gin Lys Lys Tyr Asp Ghi Asp 
20 25 30 

Gin Lys Lys Thr Ghi Glu Lys Ala Ala Leu Ghi Lys Ala Ala Ser Glu 
35 40 45 

Ghi Met Ala Lys Thr Gin His Arg Lys Ala Ala Lys Xaa Val Val Asp 
50 55 60 

Ghi Tyr He Glu Lys Met Leu Arg Ghi lie Gm Leu Asp Arg Arg Lys 
65 70 75 80 

His Thr Gin Asn Val Ala Leu Asn lie Lys Leu Ser Ala lie Xaa Asp 
85 90 95

LysAhValAkAk ValGkGta AkTyrLeu AkTyr GkGk Ak 
100 105 110 

Thr Asp Lys Ala Ala Lys Asp Ala Ala Asp Lys Met He Asp Ghi Ala 
115 120 125 

Lys Lys Arg Ghi Ghi Gk Ala Lys Iftr Lys Phe Asa Thr Vai Arg Ak 
130 135 140 

Met Thr Lys Tyr Leu Air Ghi Leu Xaa Val Xaa Glu Glu Lys Sex Xaa 
145 150 155 160 

Xaa Ghi Leo Pro Ser Ghi He Lys Ala Lys Leo Asp Ala Ala Phe Xaa 
165 170 175 

Lys Phe Lys Lys Asp Val Val Pro Ghi Pro Ghi Gin Leu Ala Glu Thr 
180 185 190 

Lys Lys Lys Ser Ghi Ghi Ak Lys Gin Lys Ala Pro Ghi Leu TV Lys 
195 200 205 

Lys Leu Gh Ghi Ala Lys Ala Lys Leu Ghi Glu Ala Gin Lys Lys Ala 
210 215 220 

Tbr Glu Ala Lys Gin Lys Val Thr Leu Lys Pro Gly Ghi Lys Val Ala 
225 230 235 240 

Ghi Ak Lys Lys Xaa Val Ghi Ghi Ah Lys Xaa Lys Ala Xaa Asp Gin 
245 250 255 

Lys Glu Ghi Asp Arg Arg Asa Tyr Pro Thr Asa Thr Xaa Lys Thr Leu 
260 265 270 

Asp Ak Ghi Glu Val Ala Pro Gin Ala Lys He Ala Glu Leu Glu Asn 
275 280 285 

Gk Val His Arg Leu Glu Gk Glu Leu Lys Ghi He Asp Glu Ser Ghi 
290 295 300 

Ser Gk Asp Tyr Ak Lys Gk Gly Phe Arg Ala Pro Leu Gta Ser Lys 
305 310 315 320 

Leu Asp Asp Leu Gk Tk Ak Gk Xaa Asp Val Lys Val Lys Ghi Ak 
325 330 335 

GkLeuGkLeuValLysGkGkAkXaaGkXaaArgAspGk Gk 
340 345 350

Lys Ik Lys Gk Ak Lys Ak Lys Val Glu Ak Lys Lys Ak Lys Leu 
355 360 365 

Ser Lys Leu Gk Gk Leu Ser Asp Lys lie Asp Glu Leu Asp Ak Glu 
370 375 380 

He Ak Lys Leu Glu Asp Gk Leu Lys Ak Ak Glu Gk Asn Asn Asn 
385 390 395 400 
Val Glu Asp Tyr Phe Lys Glu Gry Leu Glu Lys Thr Ser Lys Lys Ala
405 410 415 

Glu Ala Thr Arg Leu Glu Asn lie lie Ala Ala Lys Lys Ala Glu Leu 
420 425 430 

Ghi Lys Tfar Glu Ala Asp Leu Lys Lys Ala Val Asn Ghi Pro Glu Lys 
435 440 445 

Pro Ala Pro Ala Pro Glu Thr Pro Ala Pro Ghi Ala Pro Ala Ghi Gin 
450 455 460

Pro Lys Pro Ala Pro Ala Pro Gin Pro Ala Lys Thr Asp Xaa Lys Lys 
465 470 475 480 

Ala Glu Glu Glu Xaa Lys Arg Lys Ala Ala Glu Ghi Asp Lys Val Lys 
485 490 495 

Ghi Lys Pro Ala Glu Gin Pro Gbi Pro Ala Pro Xaa Thr Gin Pro Glu 
500 505 510 

Pro Ala Pro Lys Pro Ghi Lys Pro Ala Ghi Gta Pro Lys Pro Glu Lys 

515 520 525 

Thr Asp Asp Gin Ghi Ala Ghi Glu Asp Tyr Ala Arg Arg Ser Ghi Glu 
530 535 540 

Glu Tyr Asn Arg Leu Thr Gin Gtn Gin Pro Pro Lys Ala Glu Lys Pro 
545 550 555 560 

Ala Lys Pro Ala Pro Lys Pro Glu Lys Pro Ala Glu Gin Pro Lys Ala 
565 570 575 

Ghi Lys Thr De Asp Gin Gin Ala Ghi Ghi Glu Tyr Ala Arg Arg Ser 
580 585 590 

Ghi Ghi Glu Tyr Asn Arg Leu Xaa Gin Gin Gin Pro Pro Lys Thr Glu 
595 600 605 

Lys Pro Ala Pro Ala Pro Lys Thr Gin Pro Xaa Thr Pro Lys Tlir 
610 615 620 



Claims 

1. An immunological composition comprising at least two different full length isolated PspAs. 

2. An immunological composition comprising at least two different isolated PspAs. 

3. The immunological composition of claim 2 wherein the two PspAs are from different groups based on restriction 
fragment polymorphism analysis. 